ABSTRACT


The process of finding an exact minimization for a multiple-valued logic (MVL) 
expression requires an extensive search and enormous computation time. One of the heuristics 
to reduce this computation time is the Neighborhood Decoupling (ND) Algorithm by Yang and 
Wang. This algorithm finds near-optimal solutions for the given MVL expressions. The ND 
algorithm is an extension of HAMLET (Heuristic Analyzer for Multiple-valued Logic
Expression Translation).

The primary goal of this thesis is to reduce the computation time of the ND algorithm 
by using parallel processors. We developed a parallel version of the ND algorithm and tested 
it on an iPSC/2 (Intel Personal Supercomputer). The parallel version of the ND algorithm
executes in parallel a portion of the ND algorithm known as the clustering factor
calculation. The number of nodes needed to run the programs is twice the number of input
variables of the expression. The results indicate that the parallel version of the ND algorithm
halves the computation time compared to the sequential version.

A secondary goal of this thesis is to initiate the parallelization of HAMLET and the 
study of parallel computers, i.e., the iPSC/2. The experience we obtained with the iPSC/2
suggests an alternative algorithm. The ND algorithm searches the first branch of the search
tree assuming that the optimum solution will be on that branch. We developed a Multi-branch
Concurrent ND (MCND) algorithm which concurrently searches multiple branches, hence
increasing the probability of reaching the optimum.
I. INTRODUCTION


A. MOTIVATION

Very-large-scale-integration (VLSI) technology has matured to a point where 
large logic circuits are economically realized in silicon. However, two major 
problems, bus connection and pin limitation, are bottlenecks to further integration. 
Multiple-valued logic offers a solution to these problems. In recent years, multiple- 
valued logic has been used in programmable logic arrays (PLA) based on charge- 
coupled devices (CCD) or current-mode CMOS [Ref. 1, 2, 3, 4]. PLA’s provide a 
structured and modular approach to logic design. Consequently, there has been 
considerable interest in computer-aided design and logic synthesis tools for multiple- 
valued PLA’s. 

Several heuristic algorithms have been developed for the multiple-valued logic 
minimization and each claims some advantages in specific examples, but none of 
them is consistently better than the others [Ref. 5, 6, 7, 8, 9]. Heuristic algorithms 
are important because the only known algorithms guaranteed to find a minimal 
solution require an enormous search and are extremely time consuming. A heuristic 
called the Neighborhood Decoupling Algorithm (ND) has been developed at the 
Naval Postgraduate School (NPS) [Ref. 10]. This algorithm finds near-minimal
solutions for given MVL expressions. However, for large PLA’s, the computation time
needed is also large.


This thesis shows how to reduce the computation time needed to minimize
multiple-valued logic expressions by using parallel computers. Specifically, a parallel
version of the Neighborhood Decoupling Algorithm is implemented in
concurrent C and is run on an iPSC/2 (Intel Personal Supercomputer).


B. BACKGROUND 

With the computer software developed at NPS called HAMLET (Heuristic 
Analyzer for Multiple-valued Logic Expression Translation), users can investigate 
heuristics of their own [Ref. 12]. The HAMLET execution procedure of these 
algorithms is abstracted as follows. Formal definitions will be covered in the next 
chapter. Let $f$ be a multiple-valued function, and let $\alpha$ be a minterm of $f$.

/***************************************************************/
Input:  let M be the set of minterms of a function f;
Output: the minimized sum of products, S, of the original function;
/***************************************************************/
S ← ∅;
While (M ≠ ∅) do {
    pick one minterm α from M;
    find an implicant I_α which covers α;
    S ← I_α ∪ S;
    subtract I_α from f;
}
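
The loop above can be sketched in a few lines of Python. This is only an illustration of the shared skeleton, not the HAMLET implementation: the names minimize, first_minterm, and single_cell are hypothetical, the function is stored as a dictionary from coordinates to values, and the subtraction ignores the don't-care handling that is introduced in Chapter II.

```python
def minimize(f, pick_minterm, find_implicant):
    """Greedy cover loop. f maps coordinate tuples to nonzero values."""
    remaining = dict(f)            # working copy of the minterm set M
    solution = []                  # S, the sum of products being built
    while remaining:               # While (M is not empty)
        alpha = pick_minterm(remaining)
        coeff, cells = find_implicant(remaining, alpha)
        solution.append((coeff, cells))
        for cell in cells:         # subtract the implicant from f
            if cell in remaining:
                remaining[cell] -= coeff
                if remaining[cell] <= 0:
                    del remaining[cell]
    return solution


# Toy heuristics (hypothetical): take the first minterm in sorted order
# and cover it with a one-cell implicant of the same value.
def first_minterm(m):
    return min(m)


def single_cell(m, alpha):
    return m[alpha], frozenset([alpha])


f = {(0, 0): 2, (0, 1): 3, (1, 1): 1}
cover = minimize(f, first_minterm, single_cell)
```

The four heuristics of TABLE 1.1 differ only in what they would supply for pick_minterm and find_implicant.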


TABLE 1.1: SUMMARY OF FOUR HEURISTIC ALGORITHMS

Heuristic Algorithm                     Choice of Minterm                  Choice of Implicant
Pomper and Armstrong [Ref. 5] (1981)    Random                             Drives Most Minterms to 0 or don't-care
Besslich [Ref. 6] (1986)                Smallest Weight (Most Isolated)    Drives Most Minterms to 0 or don't-care
Dueck and Miller [Ref. 7] (1988)        Largest IF (Most Isolated)         Largest BCR
Yang and Wang [Ref. 10] (1989)          Smallest CF (Most Isolated)        Smallest NRC


TABLE 1.1 shows four previously proposed algorithms. They differ from each
other in the manner of picking the minterms ($\alpha$) and finding the implicants ($I_\alpha$). The
Neighborhood Decoupling Algorithm developed by Yang and Wang is a modified
version of Dueck and Miller's. All of these algorithms initiate a search procedure
for $\alpha$ and evaluate the input function expression $f$ at minterm $\alpha$. Next, an implicant
$I_\alpha$ is chosen which covers $\alpha$. Then, implicant $I_\alpha$ is added to the output solution set S, and
$I_\alpha$ is subtracted from function $f$.

The Pomper and Armstrong heuristic picks $\alpha$ randomly (as long as $\alpha$ is in the
set of minterms M) and finds an $I_\alpha$ (as long as $I_\alpha$ covers $\alpha$) which drives the most
minterms to 0 or don't-care when $I_\alpha$ is subtracted from function $f$ [Ref. 5]. In 1986,
Besslich presented an algorithm using weight transformations. The Besslich
algorithm picks the $\alpha$ with the smallest weight (the most isolated minterm) and finds the $I_\alpha$
which has the lowest cost per minterm covered (i.e., which drives the most minterms
to 0 or don't-care) [Ref. 6]. In 1988, Dueck and Miller presented another algorithm
that picks $\alpha$ from M if $\alpha$ has the highest isolated factor (IF) and then finds the $I_\alpha$
which directly covers $\alpha$ such that the break count reduction (BCR) is maximum
[Ref. 7]. The ND algorithm by Yang and Wang is an improvement to the Dueck
and Miller algorithm with revised decision rules for making selections of minterms
and implicants. The ND algorithm is characterized by adopting the advantages of
each algorithm and fully utilizing the properties of the truncated sum. The Parallel
Neighborhood Decoupling (PND) algorithm is the parallel version of the ND
algorithm.


C. THESIS OUTLINE 

A summary of MVL definitions for truncated sum minimization is introduced
in Chapter II. The notations and definitions of Chapter II also help us in explaining
the algorithms in subsequent chapters. The computer system, the iPSC/2, that is used
for developing the Parallel Neighborhood Decoupling algorithm is presented in
Chapter III. Chapters IV and V discuss the computation times of the sequential and
parallel versions of the ND algorithm.


II. NOTATIONS AND DEFINITIONS
The definitions for truncated sum MVL minimization are given by Yang and
Wang [Ref. 10, 11], and we use them here.


A. DEFINITIONS FOR TRUNCATED SUM 
Definition 1: 

Let $X = \{x_1, x_2, \ldots, x_n\}$ be a set of $n$ input variables where $x_i$ takes on values
from $R = \{0, 1, \ldots, r-1\}$. An $n$-variable $r$-valued function $f$ is a mapping

$$f : R^n \to R \cup \{r\}. \quad \text{[Ref. 9]}$$

Here, $r$ is a don't-care value; it can be chosen freely from any of the logic
values $0, 1, \ldots, r-1$.
Definition 2: MIN

The MIN function [Ref. 9] is denoted as $f(x_1, x_2) = x_1 x_2$, which evaluates to the
minimum value of its arguments. For example, if $R = \{0,1,2,3\}$, then $f(1,2) = 1$ and
$f(0,3) = 0$. A minterm is an assignment of values to $x_1, x_2, \ldots, x_n$ such that $f(x) \neq 0$.
Definition 3: Literal 

The literal operation of a variable $x$ is defined as:

$${}^{a}x^{b} = \begin{cases} r-1 & \text{if } a \le x \le b, \\ 0 & \text{otherwise.} \end{cases} \qquad (2.1)$$


Definition 4: Truncated Sum (TSUM) 

The truncated sum (TSUM) operation is defined as:

$$\mathrm{TSUM}(x_1, x_2) = x_1 + x_2 = \min(x_1 + x_2,\; r-1). \qquad (2.2)$$

The two + signs in this expression are different. The leftmost denotes the 
TSUM operation, while the rightmost denotes ordinary addition of two logic values 
which are viewed as integers. For example, if R = {0,1,2,3}, then TSUM(1,2) = 3 
and TSUM(2,2) = 3. The TSUM obeys the associative and commutative rules. 

These definitions are inspired by the fact that CCD implementation supports 
TSUM naturally [Ref. 9]. 
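
The three operations defined so far (MIN, literal, and TSUM) are small enough to write out directly. The following Python sketch is only an illustration under the definitions above; the function names mvl_min, literal, and tsum are our own and the radix r is passed explicitly.

```python
def mvl_min(*values):
    """MIN: the minimum of its arguments (Definition 2)."""
    return min(values)


def literal(x, a, b, r):
    """Literal of x with window [a, b]: r-1 inside the window, else 0."""
    return r - 1 if a <= x <= b else 0


def tsum(x1, x2, r):
    """Truncated sum: ordinary addition clipped at r-1 (Definition 4)."""
    return min(x1 + x2, r - 1)


r = 4                                   # R = {0, 1, 2, 3}
print(tsum(1, 2, r), tsum(2, 2, r))     # both sums are clipped to r-1
print(mvl_min(2, literal(2, 1, 3, r)))  # a constant 2 scales the literal
```

Because tsum clips at r-1, any value added to a saturated value leaves it saturated, which is the first property used in Section B.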
Example 1: 

For example, ${}^{1}x_1^{3}$ is a literal and takes a value of 3 when $1 \le x_1 \le 3$. However, the
function $2\,{}^{1}x_1^{3}$ takes a value of 2 based on the definition of MIN.
Definition 5: Product Term 

A product term $p$ is the MIN of one nonzero constant $c \in R$ and one or more
literal functions. In general, a product term is defined as:

$$p = c\,{}^{i_1}x_1^{j_1}\,{}^{i_2}x_2^{j_2} \cdots {}^{i_n}x_n^{j_n}, \qquad i_k \le j_k. \qquad (2.3)$$

The constant or coefficient $c$ in a product term effectively scales the term. For
each variable $x_k$ we say the window size of the literal ${}^{i_k}x_k^{j_k}$ is $j_k - i_k + 1$. We
use the terms product term and implicant interchangeably in this thesis.


Definition 6: Minterm 

A minterm $\alpha$ is a product term in which all literals have a window size of 1.
For example, a product term of the form $2\,{}^{a_1}x_1^{a_1}\,{}^{a_2}x_2^{a_2}$ is also a minterm. We say the
coordinate of $\alpha$ is $\langle a_1, a_2, \ldots, a_n \rangle$. We denote the value of minterm $\alpha$, $g(\alpha)$, as the
nonzero constant $c$.

A product term $p = c\,{}^{i_1}x_1^{j_1}\,{}^{i_2}x_2^{j_2} \cdots {}^{i_n}x_n^{j_n}$ can be decomposed into
$\prod_{k=1}^{n} (j_k - i_k + 1)$ minterms. We say $p$ generated those minterms. Given a
product term $p$, the set of minterms generated from $p$ is denoted by $MS_p$. If the
number of elements in $MS_{p_1}$ is greater than that in $MS_{p_2}$, we say $p_1$ covers a larger
area than $p_2$. Given a function $f$, the set of minterms generated from its product
terms is denoted by $MS_f$.
Definition 7: Sum-of-Products Expression 

A sum-of-products expression is $p_1 + p_2 + \cdots + p_N$ for some integer $N$, where
$p_i$ is a product term. For example, a sum of three product terms with coefficients
3, 2, and 3 is a sum-of-products expression.
Definition 8: Saturated Minterms (SAT) 

Given a minterm $\alpha$ generated from the original function to be minimized, if
$g(\alpha) = r - 1$, then $\alpha$ is a saturated minterm. Let SAT be the set of all saturated
minterms of a function.


Example 2: 
If the input function to be minimized is expressed as follows, 
f-3 lx) ixle2 Sx? бж, По Пк И ар хе MER 
the $MS_f$ can be represented as 15 minterms in Figure 2.1. We mark a saturated
minterm with a dot in the figure.





Figure 2.1: Map for Examples 2, 3, 4; Step 1 of Table 3.2


Lemma 1: Given a minterm $\alpha$, the maximum number of implicants which cover $\alpha$ is
$O(r^{2n})$.

Proof: Consider a variable (axis) $x_i$ of $\alpha$. Any implicant ($I_\alpha$) that covers $\alpha$ may have
a range or "window size" $w$ such that $1 \le w \le r$. With a window size $w$, we may
have $w$ implicants that cover $\alpha$. That is, for a given position $a$ within a window,
there are $(a+1)$ ways to choose a lower bound on the window $(0, 1, \ldots, a)$ and
$r-1-a+1$ ways to choose the upper bound, for a total of $(a+1)(r-a)$ ways, which achieves
a maximum of about $r^2/4$ when $a = r/2$. Since the windows on the $n$ axes are chosen
independently, the number of implicants is at most about $(r^2/4)^n$.
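
The counting argument in the proof can be checked by brute force. The Python sketch below is our own illustration (the names count_windows and count_implicant_shapes are hypothetical); it enumerates every window on one axis that contains position a and compares the count with $(a+1)(r-a)$. It counts window shapes only and ignores the choice of coefficient.

```python
def count_windows(r, a):
    """Number of windows [lo, hi] within 0..r-1 that contain position a."""
    return sum(1
               for lo in range(r)
               for hi in range(lo, r)
               if lo <= a <= hi)


def count_implicant_shapes(r, coord):
    """Windows are chosen independently per axis, so the counts multiply."""
    total = 1
    for a in coord:
        total *= count_windows(r, a)
    return total


r = 4
for a in range(r):
    # Enumeration agrees with the closed form used in the proof.
    assert count_windows(r, a) == (a + 1) * (r - a)
```

For a 2-variable 4-valued function, count_implicant_shapes(4, (1, 2)) multiplies the two per-axis counts, which is the product form behind the bound in Lemma 1.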


B. THE PROPERTIES OF TRUNCATED SUM 
There are two important properties of the truncated sum which are useful later
in developing the ND algorithm.

1.  Saturated minterms can be generated by the TSUM operation.

The truncated sum of two or more minterms may produce a
saturated minterm. By Definition 4, the truncated sum of any saturated
minterm and a minterm identical except for the coefficient is a saturated
minterm. In other words, given two minterms $\alpha$, $\beta$ such that $g(\beta) = r-1$,
then $\mathrm{TSUM}(\alpha, \beta) = r-1$. If the value of $\gamma$ is $r-1$, i.e., $\gamma$ is a saturated
minterm, then for any other minterm $\delta$, $\gamma + \delta = \gamma$.

As an example, in a 2-variable 4-valued function, three minterms add
in one position.

$$2\,{}^{1}x_1^{1}\,{}^{2}x_2^{2} + 2\,{}^{1}x_1^{1}\,{}^{2}x_2^{2} + 1\,{}^{1}x_1^{1}\,{}^{2}x_2^{2} = 3\,{}^{1}x_1^{1}\,{}^{2}x_2^{2} + 1\,{}^{1}x_1^{1}\,{}^{2}x_2^{2} = 3\,{}^{1}x_1^{1}\,{}^{2}x_2^{2}$$

The first two terms form a saturated minterm, and this saturated
minterm absorbs the third minterm.


2.  Don't-care minterms can be produced by saturated minterms.

In the minimization procedure, we may update a minterm $\alpha$ to $\alpha'$ by
subtracting minterm $\gamma$ ($\alpha' \leftarrow \alpha - \gamma$), where $\gamma$ is the value of the selected
implicant. If $\alpha \in$ SAT, in a succession of updates, the value of $\alpha'$ may
reach the value 0. In that case, the algorithm will reset that minterm
coordinate to don't care, i.e., value $r$. In this way, additional values can
be subtracted, perhaps producing a set of fewer implicants than the case
where we require product terms to sum equal to the maximum value
(rather than equal to or greater).


C. DEFINITIONS USED IN ND ALGORITHM 
Definition 9: Direct Neighbors 

Let $\alpha$ and $\beta$ be minterms with coordinates $\langle a_1, a_2, \ldots, a_n \rangle$ and $\langle b_1, b_2, \ldots, b_n \rangle$
respectively. If for all $i$ we have $a_i = b_i$ except one position $j$ such that $|a_j - b_j| = 1$,
we say that $\alpha$ and $\beta$ are direct neighbors. Given a minterm $\alpha$, we use $N(\alpha)$ to
denote the set of its direct neighbors.
Observation 1: The maximum number of direct neighbors of a given minterm is $2n$.
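
Definition 9 and Observation 1 can be illustrated with a short Python sketch. It is our own illustration, with the hypothetical name direct_neighbors; it enumerates candidate neighbor coordinates inside the $r$-valued grid, so the result is an upper bound on $N(\alpha)$: only those coordinates at which the function has a minterm are actual direct neighbors.

```python
def direct_neighbors(coord, r):
    """Coordinates that differ from coord by exactly 1 in exactly one position."""
    out = []
    for j, a_j in enumerate(coord):
        for step in (-1, 1):            # negative, then positive direction
            b_j = a_j + step
            if 0 <= b_j < r:            # stay inside the logic values 0..r-1
                out.append(coord[:j] + (b_j,) + coord[j + 1:])
    return out


interior = direct_neighbors((1, 1), 4)  # 2n candidates for n = 2
corner = direct_neighbors((0, 0), 4)    # fewer at the edge of the map
```

An interior coordinate reaches the bound of $2n$, while a coordinate on the boundary of the map has fewer candidates.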
Definition 10: Directional Neighbors 

Two minterms $\alpha$ and $\beta$ are directional neighbors in the direction $x_j$ if $a_i = b_i$
for all $i \in [1,n]$ such that $i \neq j$ and $a_j \neq b_j$. When $b_j > a_j$ we say that $\beta$ is in the
positive direction of $\alpha$, while when $b_j < a_j$ we say that $\beta$ is in the negative direction of $\alpha$.
Observation 2: If $\beta$ is a direct neighbor of $\alpha$ then $\beta$ is a directional neighbor of $\alpha$ in
the direction of $x_i$ for some $i \in [1,n]$.
Definition 11: Connected Minterms 

This is a recursive definition. Given a minterm $\alpha$ and a minterm $\beta$, we say
$\beta$ is a connected term of $\alpha$ if

1. $\beta$ is a direct neighbor of $\alpha$ and either $g(\beta) < g(\alpha)$ or $\alpha \in$ SAT.
2. $\beta$ is a directional neighbor of $\alpha$ in direction $x_i$ and $\beta$'s direct neighbor is
connected to $\alpha$ and either $g(\beta) < g(\alpha)$ or $\alpha \in$ SAT.

For example, in Figure 2.2 the minterms pointed to by arrows are connected minterms
of the minterm marked with the @ sign.





Figure 2.2: Example for Connected Minterms 

Definition 12: Connected Minterm Count 

$CMC_\alpha$ is the connected minterm count of minterm $\alpha$. It is the number of
minterms that are connected to minterm $\alpha$.
Definition 13: Expandable Directional Count

$EDC_\alpha$ is the expandable directional count of minterm $\alpha$. It is the number of
directions (both positive and negative for each $x_i$) in which $\alpha$ has one or more
connected minterms.

Observation 3: $0 \le EDC_\alpha \le 2n$.

Definition 14: Clustering Factor 
The clustering factor relative to a minterm $\alpha$ is defined as

$$CF_\alpha = (r-1) \cdot EDC_\alpha + CMC_\alpha. \qquad (2.4)$$

This is a measure of the weight of all connected minterms relative to $\alpha$. The
$(r-1)$ factor is the range, or maximum possible number of minterms, in a direction
$x_i$.
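
Equation (2.4) is a single weighted sum. As a minimal sketch (the function name clustering_factor is ours, and the radix r = 4 in the usage lines is an assumption made only for illustration):

```python
def clustering_factor(r, edc, cmc):
    """CF = (r-1) * EDC + CMC, as in Equation (2.4)."""
    return (r - 1) * edc + cmc


# One expandable direction and one connected minterm, assuming r = 4.
cf_one = clustering_factor(4, 1, 1)
# A fully isolated minterm has EDC = CMC = 0 and therefore CF = 0.
cf_isolated = clustering_factor(4, 0, 0)
```

Because EDC is weighted by $r-1$ and at most $r-1$ connected minterms can lie in one direction, a minterm that can expand in fewer directions always receives a smaller clustering factor than one that can expand in more, whatever their CMC values.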





Figure 2.3: Map for Example 3, Step 2 of Table 3.2 


Example 3: 

In Figure 2.1 the minterm marked with the @ sign is one
of 15 minterms and has only one connected minterm and so only one expandable
directional neighbor, i.e., its CMC and EDC values are 1 and 1, respectively.
Figure 2.3 shows the map after the circled implicant was subtracted from
Figure 2.1. We mark a minterm with a dot in the figure because it was a saturated
minterm in the original function map (see Definition 8 and Figure 2.1). In Figure 2.3, the
minterm marked with the @ sign has no connected
minterms nor expandable directional neighbors, so $CMC_\alpha = 0$ and $EDC_\alpha = 0$. The
clustering factors of all minterms in Figure 2.3 are listed in TABLE 2.1.


TABLE 2.1: CF'S FOR ALL MINTERMS IN FIGURE 2.3 


III. iPSC/2 CONCURRENT SUPERCOMPUTER


A. SYSTEM DESCRIPTION 

In an iPSC/2 system, a large number of processors or nodes work concurrently
on parts of a simple problem. An iPSC/2 system consists of compute nodes and a
front end processor, called the host. A node is an 80386 processor/memory pair. Its
physical memory is distinct from that of the host and other nodes, i.e., it is a distributed
memory system. Each node runs the NX/2 operating system, and can access both the
host file system and the iPSC/2 Concurrent File System. The host system runs the UNIX
System V operating system.

A typical iPSC/2 application has a host program that runs on the host and a 
node program that runs on a group of allocated nodes called a cube. The host 
program executes in the UNIX environment as a process. It initializes the 
application, provides any necessary human interface, and loads the node program 
onto the nodes. Generally, a node program performs calculations, exchanges
messages with other nodes, and sends results back to the host.


B. SYSTEM CHARACTERISTICS 
An iPSC/2 system consists of the following units: 
• IBM 386 AT Host Server
• 1.5 Gigabytes (OACIS) / 100 Megabytes (Math Dept.) hard disk space
• 32 nodes (OACIS) / 8 nodes (Math Dept.), each with
    - 80386 Processor
    - Weitek 1167 (OACIS) / 80387 (Math Dept.) Math Coprocessor
    - 8 MBytes (OACIS) / 4 MBytes (Math Dept.) of Memory

Before loading the programs to the nodes, a cube must be allocated. The cube
may consist of all the nodes in an iPSC/2 system or a subset of the nodes, but the
number of nodes is always a power of two; that is, a k-cube consists of $2^k$ nodes.


C. PARALLEL PROGRAMMING 

The degree of parallelism is different from program to program. A perfectly 
parallel program is the one that requires no internode communication. In a perfectly 
parallel program, if we double the number of nodes, we halve the computation time. 
But most programs involve a mix of computation and internode communication. 
One of the goals of parallel algorithm design is to develop a communication strategy that
maximizes the time a node spends computing and minimizes the time it spends
communicating or waiting for another node to complete a computation.

Communication among processes in an iPSC/2 system is done with message 
passing. Nodes do not share physical memory. Messages are characterized by a 
length, a type and an ID: 


• The message length is the length of the structure in bytes. The message
sending routines will send exactly the specified message length.

• The message type defines the message which a particular node is waiting for.

There are two types of messages that can be sent: synchronous and asynchronous.

Another way of communicating between the nodes is by global operations. The
global operations are high level constructs for communication among the node
processes [See Section D]. In global operations, the results are shared between the
nodes, so instead of sending messages from nodes to the host and then calculating
the results, only the result of the global operation is sent to the host by one of the
nodes. This may reduce the message traffic over the system.


D. SUMMARY OF iPSC/2 SYSTEM CALLS 
The system calls that are used in the ND parallel algorithm and Multi-branch 


Concurrent algorithm are as follows; 


e Node identification : setpid(), myhost(), mynode(), numnodes() 
e Clock ; mclock() 

e Program loading : load() 

e Message Passing : csend(), crecv(), gisum() 


e Concurrent File System : open(), cwrite() 

The system call setpid(HOST_PID) is used to assign the process id of the host
program. This id is needed for message passing between the host and the nodes. In
our program HOST_PID is defined in "pardef.h" [See Appendix A]. For message
passing purposes, the host is considered to have a node number, which is always one
more than the highest numbered node in the cube (or equal to the number of nodes
in the cube). For example, the host's node number in an 8-node cube is 8 while 0
through 7 are used to number nodes in the cube. The call myhost() returns the host's
node number. The system call mynode() returns the number of the node on which
the program is executing. This call is useful for making decisions based on the node
number of a process [See Chapter V, Section B]. The system call numnodes() returns
the number of nodes in the allocated cube. This call is especially useful for making
the programs general purpose. By using numnodes() the user does not have to enter
the cube size into the program.

The mclock() routine provides a simple mechanism to measure time
intervals. The system call mclock() returns the value of a counter that reflects
relative time in milliseconds. We obtain an initial time value and interpret the stop time
relative to this initial value. We use mclock() only in the MCND algorithm.

The system call load(filename, node, id) is used for loading the processes
(filename) onto the nodes. As soon as a node is loaded, it starts the execution of the
program. The variable node is an integer which defines the node number on which
the process will be loaded. When node is set to -1, the load() instruction
broadcasts to all nodes. The variable id is the process id of the program that will be
loaded. Each node can be loaded with up to 20 processes, but in our programs we
used only one process per node, so the only process id is 0.

The system calls csend(type, buf, len, node, pid) and crecv(type, buf, len) are
synchronous message passing instructions. The iPSC/2 also provides asynchronous
message passing, but because the nodes start execution right after they are
loaded, we need to block the processes until the message that contains the Working
Expression Set and the coordinates of the minterm is received. With synchronous
message passing the node resumes execution only after the message is received.
Asynchronous message passing could be used, but then another instruction, msgwait(),
is needed to block the process while it waits for the message. The variable type assigns the
message id which that instruction is sending or waiting for. The variables buf and len
define the address and size of the message buffer. The variable node has the same
effect as in load(), i.e., it defines the node to which the message will be sent. If it is -1,
the message is broadcast to all the nodes. Lastly, pid specifies the process id which
is to receive the message.

The system call gisum(x, n, work) is one of the global
operations. These operations accumulate data from the entire allocated cube. The variable x is
the pointer to the input vector to be used in the operation; after the completion of
the operation it contains the final result. The variable n is the length of the vector
and work is a working array for the summation. All the nodes must call the same
routine (with their own x) for a specific operation; in our case, it is an integer
summation and the final result is distributed to all nodes. The system call gisum()
calculates the sum of each integer component of x across all nodes. The result is
returned in x to every node.
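
The effect of the global integer sum can be mimicked on a single machine. The Python sketch below is a simulation for illustration only and does not call the iPSC/2 library: the name simulated_gisum is hypothetical, and each inner list stands for the vector x held by one node.

```python
def simulated_gisum(vectors):
    """Component-wise integer sum across nodes; every node gets the result."""
    totals = [sum(column) for column in zip(*vectors)]
    # Each node receives its own copy of the same summed vector.
    return [list(totals) for _ in vectors]


# Three nodes, each contributing [EDC, CMC] for its own direction.
per_node = [[1, 2], [0, 3], [1, 0]]
after = simulated_gisum(per_node)
```

After the operation every entry of after holds the same two totals, which is why a single node is enough to report the result to the host.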

The system call open(filename###, O_CREAT|O_RDWR|O_APPEND, 0644)
opens a file and returns a file number that can be used later. The three "#" symbols
after the file name are replaced by the number of the node which opens the file.
The call cwrite(file_no, buf, strlen(buf)) writes the data which is in the buffer to the file with
the assigned file_no. To send formatted streams to the buffer, we used the sprintf()
instruction. This buffer is then written to the file.
IV. PARALLEL NEIGHBORHOOD DECOUPLING ALGORITHM

The parallel neighborhood decoupling algorithm is a parallel version of the 
ND algorithm [Ref. 10]. The Parallel ND (PND) algorithm has two computational 
phases: minterm selection and implicant selection. Minterm selection is based on the 
clustering factor computation [See Chapter II Section C]. Implicant selection is 
based on Neighborhood Relative Count (NRC) computation. From all implicants 
which cover the selected minterm, the implicant that is the most loosely coupled 
(isolated) with its neighbors is chosen. This decoupling process is based on the fact 
that if we choose the most isolated implicant then we may minimize the negative 
impact for future minterm selections as well as implicant selections. 

In the ND algorithm, before selecting another isolated minterm, the
implicant that is selected should be subtracted from the expression. The update of
the expression must be completed before the minterm selection of the next
computation phase. We searched for a part of the algorithm in which we can minimize
the communication and maximize the time spent on computation, and found that the
CF computation was a good candidate for parallelization. The other parts of the
algorithm, such as the Neighborhood Relative Count, are not so amenable to
parallelization, because they need much communication time compared to the
computation which would be performed by a node. For example, in the NRC
computation [See Section B], much time is spent executing conditional branch
instructions. Even though the NRC algorithm is a large piece of static code, the dynamic
code is not large enough: too much communication time would be spent sending
the data to the node where the NRC procedure executes, so parallelizing it is not feasible. The
main idea in parallelizing the CF computation is to perform the EDC and CMC
[Definitions 12 & 13] computations in each direction for a variable of a minterm.
The number of nodes that is needed depends on the number of variables. For
each variable, we need two nodes, one for the negative side of a minterm at the
corresponding coordinate and the other for the positive side. The EDC's and CMC's
that are calculated are summed using a global sum operation, and node #0 sends
the total EDC and CMC values to the host. The host then asks for another
minterm's CF value.

In the sequential version of the Yang and Wang algorithm, the main program 
asks for the coordinates of a minterm which has the smallest clustering factor. The 
sequential clustering factor procedure computes the EDC and CMC values for the 
negative direction of the first coordinate and then computes those values for the 
positive direction of the same coordinate. Then, the EDC and CMC values of the 
second coordinate are computed for the negative and positive directions. This 
procedure is applied to all consecutive coordinates, i.e., variables. The results are
summed up and CF is calculated. When the number of variables is increased,
we have more coordinates to compute. This computation scheme is depicted in
Figure 4.1.

Figure 4.1: Flowchart of Sequential ND Algorithm


The parallel version of the ND algorithm has a different approach to the 
clustering factor computation. We still need the EDC and CMC values for the 
negative and the positive directions of the coordinates of the selected minterm. The 
parallel version loads the codes needed to calculate the negative and positive 
directions of a coordinate to the nodes. For a 3 input variable expression, 6 nodes
are required. The allocation of the nodes is shown in Figure 4.2. The dummy
nodes in Figure 4.2 are explained at the end of Section A.

The main benefit from the parallel algorithm comes when we increase the
number of variables. The time needed in the sequential computation is
proportional to the number of variables. Figure 4.1 shows that when we have more
variables, the algorithm will grow vertically, requiring more computation time.
Figure 4.2 shows that when we have more input variables, the algorithm can expand
horizontally (until we run out of nodes). Thus, the parallel algorithm will not spend
as much time as the sequential algorithm to compute a clustering factor.


[Flowchart: the host selects a minterm, the nodes compute EDC and CMC in parallel, and the
host computes CF from the summed EDC and CMC values.]


Figure 4.2: Flowchart of Parallel ND Algorithm 


The ND Algorithm is listed below. In this algorithm, f denotes the function
to be minimized.

/****************************************************************
MS_f : Original Expression Set
WS   : Working Expression Set
SS   : Solution Set
****************************************************************/
{
    SS ← ∅;    /* SS = Solution Set */
    WS = MS_f = { α | α is generated by the function f; if α ∈ SAT then mark its
                  coordinate }.
    While WS ≠ ∅ do {
        1. Use algorithm CF_PAR to select a minterm α from the WS.
        2. Use algorithm N to select an implicant I_α that covers α.
        3. SS ← SS ∪ I_α.
        4. ∀ β ∈ I_α do
               compute g(β) ← g(β) − g(α).
               subtract I_α from WS.
               if β is originally marked and g(β) = 0 then g(β) ← r.
                                                /* don't care terms */
    }
}


The search space of the algorithm can be represented as a tree where each
node represents the current working expression set and each edge corresponds to
an implicant selection. The root of the search tree is the original expression set, or
MS_f.

A. ALGORITHM CF_PAR: MINTERM SELECTION

The ND Parallel algorithm computes the clustering factor for all minterms in
a working expression set. The number of nodes that is actually needed is (2 *
number of variables in the expression), but the system allows only a power-of-2
number of nodes to be allocated. For example, even though we need only 10 nodes
for 5 input variables, we have to allocate 16 nodes.
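
The allocation rule amounts to rounding twice the number of variables up to the next power of two. A minimal Python sketch, with the hypothetical names nodes_needed and cube_size:

```python
def nodes_needed(nvars):
    """Two nodes per input variable: negative and positive direction."""
    return 2 * nvars


def cube_size(nvars):
    """Smallest power-of-two cube that holds the needed nodes."""
    size = 1
    while size < nodes_needed(nvars):
        size *= 2
    return size


for nvars in (3, 4, 5):
    print(nvars, nodes_needed(nvars), cube_size(nvars))
```

The difference between cube_size and nodes_needed is the number of dummy nodes that take part in the global sum without contributing to it.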

The host program ND_PAR() loads the first half of the allocated nodes with
the program which computes the negative direction of a coordinate (cf_left) and the
second half with the program for the positive direction (cf_right). For each Working
Expression Set, the most isolated minterm's coordinates are requested. The host
program loads the current working set into a message array. This array is defined
in "pardef.h" and consists of the expression and the coordinates of the selected
minterm. A minterm is selected from the working set and its coordinates are
assigned to the message array. The message is broadcast to the nodes by using
synchronous message passing. The host program then blocks on a receive instruction
waiting for the results.

After the nodes are loaded, the node programs start execution. Nodes block
on a receive instruction and wait for the message from the host. After they receive
the message from the host, they determine their assigned coordinates using the system
call mynode(). For example, for a 4 input variable expression, 8 nodes are allocated.
The nodes from 0 to 3 compute the negative direction of the coordinates (X1
through X4) while nodes 4 to 7 compute in the positive direction for the
coordinates. If the number of nodes needed is less than the number of allocated nodes, then
the extra nodes become dummy nodes. All nodes check their allocated coordinates,
and if the coordinate is larger than the number of variables in the expression, they
return 0 for both EDC and CMC values. All EDC and CMC values computed on
the nodes are summed by using the global summation gisum(). The result is available
to all the nodes. Node #0 has the special assignment of sending the result to the host.
The host calculates the CF using the EDC and CMC that are reported by node #0. The
host then selects another minterm from the working expression set. The above
procedure is repeated until the CF values of all minterms in the working
expression set are computed.
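
The mapping from node number to work assignment described above can be sketched as follows. This Python function is our own illustration (node_role is a hypothetical name, and variables are numbered from 0 as the node numbers are); it does not run on the iPSC/2.

```python
def node_role(node, numnodes, nvar):
    """Return (variable, direction) for a node, or None for a dummy node."""
    half = numnodes // 2
    if node < half:
        variable, direction = node, "negative"          # cf_left nodes
    else:
        variable, direction = node - half, "positive"   # cf_right nodes
    if variable >= nvar:
        return None        # dummy node: contributes EDC = CMC = 0
    return variable, direction


# 3 input variables need 6 nodes, but an 8-node cube must be allocated.
roles = [node_role(k, 8, 3) for k in range(8)]
```

With 3 variables on an 8-node cube, the last node of each half has no coordinate to work on and becomes a dummy node.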
The computation of CF is as follows: 
/****************************************************************
WS  : Working Expression Set
X_i : Coordinates of a minterm α
****************************************************************/

Host Program

message_to_node ← WS
∀ α ∈ WS do {
    message_to_node ← X_i
    send (message_to_node to all nodes)
    recv (message_from_node from node 0)
    CF ← message_from_node.dea * (radix - 1) + message_from_node.ea
    if (Cur_CF > CF) {
        Cur_CF ← CF
        Savecoord ← X_i
    }
}
return the coordinates of the minimum CF minterm


Node Program (CF left)

EDC <- 0
CMC <- 0
recv (message_to_node from host)
variable_number <- mynode()        /* assign node number as coordinate */
if (variable_number < message_to_node.nvar) {
                                   /* if the node number is bigger than the
                                      number of variables, do not compute */
    Compute EDC and CMC to the left of the coordinate
}
globalsum (Add EDC and CMC values for all nodes)
if (mynode() = 0) {
    send (message_from_node to host)   /* total EDC and CMC values from all
                                          nodes */
}

Node Program (CF right)

EDC <- 0
CMC <- 0
recv (message_to_node from host)
variable_number <- mynode() - numnodes/2   /* corrects and assigns the
                                              coordinate */
if (variable_number < message_to_node.nvar) {
    Compute EDC and CMC to the right of the coordinate
}
globalsum (Add EDC and CMC values for all nodes)

B. ALGORITHM N: NEIGHBORHOOD RELATIVE COUNT 

The purpose of Algorithm N is to choose the most "isolated" implicant (I_α) and
update the working set WS. It computes the neighborhood relative count (NRC) for
all implicants that cover the minterm α. The implicant with the smallest NRC is
chosen. In other words, NRC is a measure of the coupling strength of an implicant
with its neighbors. To select an implicant (which is equivalent to breaking the
coupling between that implicant and its neighbors), the candidate implicant should
have the smallest coupling strength with its neighbors. Therefore, the ND algorithm
tends to choose the most "isolated" implicant. If there is a tie in selecting I_α,
the ND algorithm chooses the one which covers the largest area. The computation of
NRC for a given implicant is described as follows:

1. Initialize the NRC to zero.

2. Check all neighboring minterms of the implicant and increment or
decrement its NRC according to the following (intuitively stated) rule: if
the coupling strength between the covered and uncovered area is weak (good for
further decoupling), Algorithm N decreases NRC; otherwise it increases NRC.

/**********************************************************************
α    : the chosen minterm from algorithm CF_PAR
M    : the set of minterms which was covered (generated) by the chosen
       implicant (I_α).
N(β) : the set of direct neighbors of minterm β.
**********************************************************************/

{
    NRC <- 0;
    ∀ β ∈ M and β ≠ α do {
        if (g(β) - g(α) < 0) then NRC <- NRC - 2;
    }
    ∀ β ∈ M and ∀ γ ∈ N(β) do {
        if (γ ∈ M and γ ≠ 0 and (γ ∈ SAT or β ∈ SAT)) then {
            if (g(β) - g(α) > g(γ)) then {
                if (γ ∈ SAT) then NRC <- NRC - 1;
                else NRC <- NRC + 2;
            }
            if (g(β) - g(α) < g(γ)) then {
                if (g(β) = g(γ)) then NRC <- NRC + 2;
                if (γ ∈ SAT and g(γ) - g(β) < 0) then
                    NRC <- NRC + 2;
                else {
                    if (g(β) > g(α) and g(β) ≠ g(γ)) then {
                        if (β ∈ SAT) then NRC <- NRC - 1;
                        else NRC <- NRC + 2;
                    } /* end if */
                } /* end else */
            } /* end if */
            if (g(β) - g(α) = g(γ)) then {
                if (g(γ) > 0 or β ∈ SAT) then
                    NRC <- NRC - 1;
                else NRC <- NRC - 2;
            }
        } /* end if */
    }
    if (M = {α}) then {
        if (α ∈ SAT) then NRC <- 2;
        else if (NRC < 0) then NRC <- 1;
    }
    else NRC <- NRC + 2;
}





Figure 4.3: Third Step of minimization for the function in Example 4 


Example 4: 
The input function to be minimized is expressed as: 

Ё = 3 1х? 1хі+2 9x) ?xj«3 ixl 1х2+2 іхі 2х2%1 Ox? 9x41 jx] 1x; 
The working set, WS, is initialized to MS_f, and is represented in Figure 2.1. The
clustering factors of all minterms in WS are calculated, and the first minterm
with the minimum CF is selected as α; in this case it is 1 1x} °x} . The ND
algorithm computes the NRC for each implicant I which covers α using Algorithm N.
For the WS in Figure 2.1, implicant 1 Ox? ?x2 is selected. This implicant is added
to the solution set, SS, and subtracted from the working set, WS. The result can
be seen in Figure 2.3. The minterm and implicant 2 ?xj ?xj is selected (see
Example 3). This implicant is also added to the solution set and subtracted from
the working set. Because this implicant is a SAT, it is shown as don't care "4."
in the working set. Figure 4.3 shows the resulting WS. The clustering factor
computations that are performed by the different nodes are shown in TABLE 4.1.
The minimum CF is found to be 10, and it belongs to minterm 2 1x; !xj . The
implicant selected is 3 !xj !xj with an NRC of -16. Finally, the working set
should contain only the value 0 (empty square) or 4. (don't care), as shown in
Figure 4.4.





Figure 4.4: Final Working Set 
The final minimized result which is kept in solution set (SS), g, is expressed as: 
ао=ї б °х; *2 °ху °х;,+3 1x? 1x; 
As can be seen by comparing the original function with the function resulting
from the ND algorithm, the number of product terms drops from six to three, a
50% reduction.
TABLE 4.1: CMC AND EDC COMPUTATIONS FOR FIGURE 4.3
[Table body not recoverable; surviving column headings: X1 left, X1 right, X2 right]

C. COMPARISON RESULTS

In this thesis, all test results were obtained by running the test functions on
the iPSC/2 computers that were available to us at the NPS Math Department and at
Oregon Advanced Computer Information Systems (OACIS), Oregon. The two computers
are the same except that the iPSC/2 at NPS has 8 nodes with 80387 math
coprocessors, while the iPSC/2 at OACIS has 32 nodes with Weitek 1137 math
coprocessors. The choice of which computer to use depended on the size of the
functions we chose to minimize. For example, the iPSC/2 at OACIS was used for
computing five-variable four-valued functions, which need 10 nodes, while the
iPSC/2 at NPS was used for smaller functions. For test purposes, the following
functions were generated using HAMLET's test generator:

1. Two-variable four-valued with 5 to 50 input product terms. 

2. Three-variable four-valued with 5 to 70 input product terms. 

3. Four-variable four-valued with 5 to 35 input product terms. 

4. Five-variable four-valued with 5 to 35 input product terms. 

All input functions were generated randomly. Notice that more test functions
were used for three-variable four-valued expressions than for the others. A
two-variable four-valued function tends to saturate beyond 30 input product terms
and minimizes to one implicant. The three-variable four-valued test functions were
used to see whether the computation time keeps increasing exponentially as the
number of input terms increases. In each case, the same expression set was
minimized by both the sequential and the parallel version. The minimization
results are the same in all cases.

For the testing of 2 variable 4 valued expressions, we used 10 different
expression sets of 30 expressions each, consisting of 5 to 50 terms. Figure 5.1
shows that the parallel algorithm is faster than the sequential one. As the
number of terms in the expression increases, the computation time also increases,
but the rate of increase is lower for the parallel algorithm. This is especially
true after saturation, which occurs at about 30 terms. In this case, the parallel
computation time drops dramatically and the rate of climb decreases. The main
reason for this decrease is that minterm selection is done only for the first
working set (WS), because all the minterms are saturated and one implicant covers
the whole working set. But even for computing the first working set, all the
terms in the expression must be added according to their coordinates. The
sequential program does this sequentially, so as the number of implicants in the
expression increases, the computation time also increases. The parallel algorithm
works the same way, but the computation is divided among the nodes, so the rate
of increase is lower.

Figure 5.1: Comparison between Sequential and Parallel Algorithms for
2 variable 4 valued expressions


For 3 variable 4 valued expressions, we minimized expressions which consist
of 5 to 70 terms. Again, each set has 30 different expressions in it. Figure 5.2
shows that after 45 terms, the computation time levels out, with the parallel
program proceeding at twice the speed of the sequential program. Comparing
Figure 5.1 and Figure 5.2 shows the similarity between the two graphs. We expect
that if we continue to increase the number of terms in the expressions, we will
obtain a similar curve shape for 3 variable 4 valued expressions.

Figure 5.2: Comparison between Sequential and Parallel Algorithms for
3 variable 4 valued expressions


For 4 and 5 variable 4 valued expressions, we used expressions consisting of
5 to 35 terms. As can be seen from the vertical axes of Figure 5.3 and Figure 5.4,
there is a large difference between the computation times (which are larger for 5
variable expressions). These curves are also similar to the beginning of the
curves for 2 and 3 variable expressions. Saturation needs a large number of terms
for 4 and 5 variables. A 5 variable expression has a 5 dimensional space, and the
number of terms we used was not enough to obtain significant saturation because
the terms are randomly spaced. We expect the curves for 4 and 5 variables to be
similar to Figure 5.1 if we increase the number of terms in the expressions.

Figure 5.3: Comparison between Sequential and Parallel Algorithms for
4 variable 4 valued expressions

Figure 5.4: Comparison between Sequential and Parallel Algorithms for
5 variable 4 valued expressions
V. EXPERIENCES AND FUTURE DEVELOPMENTS

The experiences with the iPSC/2 and an improved algorithm are reported in this
chapter.

A. EXPERIENCES WITH iPSC/2

We encountered a number of problems in using the system and in adapting the
sequential programs to a parallel system. For example, some of the instructions
in HAMLET are system specific and had to be changed for the iPSC/2.


1. Size of the Messages 

One of the problems encountered while running the parallel ND
algorithm on the iPSC/2 was the size of the messages to be used. Pointers in the C
language work by indirect addressing and assume that the referenced memory
location is accessible to the process. The iPSC/2 is a distributed memory system,
so we cannot use pointers when we need to pass expressions and minterm
coordinates to the nodes. Instead, we must use arrays whose sizes are fixed at
compile time. The sizes of the arrays are defined in the "pardef.h" file [see
Appendix A]. The array sizes are very important because they define the size of
the messages that will be sent from the host to the nodes. We want to keep the
array sizes as small as possible to minimize the communication time. The
structure in the program requires the number of variables and the number of terms
in the expression to be defined in the "pardef.h" file. The number of terms
defined there should be twice the actual number of terms because, while the
program is processing the minimization, the implicants that are found are added
to the working set with a negative coefficient for subtraction purposes. In the
worst case there is no minimization at all, and another set of terms of the same
size as the original set is added to the working set. As in traditional C,
whenever the pardef.h file is altered, the program must be recompiled for the
changes to take effect. This procedure did not allow us to use script
programming, and we had to run all the tests one by one.
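
The conversion from a pointer-based expression to a fixed-size message can be
sketched as follows. The message structures mirror those listed for pardef.h in
Appendix A, but the pointer-based types ptr_implicant and ptr_expression and the
helpers message_terms_needed and pack_expression are hypothetical stand-ins
introduced only for this illustration; HAMLET's own structures are not
reproduced here.

```c
#define NVAR  3     /* number of variables in the expression */
#define NTERM 100   /* twice the number of input terms       */

typedef struct { short lower, upper; } msg_bound;
typedef struct { msg_bound B[NVAR]; short coeff, rbc; } msg_implicant;
typedef struct { msg_implicant I[NTERM]; short radix, nvar, nterm; } msg_expression;

/* Hypothetical pointer-based form, standing in for the host's structures. */
typedef struct { msg_bound *B; short coeff, rbc; } ptr_implicant;
typedef struct { ptr_implicant *I; short radix, nvar, nterm; } ptr_expression;

/* Array size needed for an input with the given number of terms:
 * worst case, every term is added again with a negative coefficient. */
int message_terms_needed(int input_terms)
{
    return 2 * input_terms;
}

/* Copy a pointer-based expression into a flat buffer that can be sent
 * as one message. Returns 0 on success, -1 if the compile-time limits
 * are too small for this expression. */
int pack_expression(const ptr_expression *src, msg_expression *dst)
{
    int i, j;

    if (src->nvar > NVAR || src->nterm > NTERM)
        return -1;
    for (i = 0; i < src->nterm; i++) {
        dst->I[i].coeff = src->I[i].coeff;
        dst->I[i].rbc = src->I[i].rbc;
        for (j = 0; j < src->nvar; j++)
            dst->I[i].B[j] = src->I[i].B[j];
    }
    dst->radix = src->radix;
    dst->nvar = src->nvar;
    dst->nterm = src->nterm;
    return 0;
}
```

Because sizeof(msg_expression) is fixed by NVAR and NTERM, every message carries
the full array even when the expression is small, which is why the text
recommends keeping these constants as tight as possible.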


2. Debugging

There are two ways to debug a program: application checkpointing and the
system debugger. Application checkpointing means placing print instructions at
different points within the source code to monitor the values of the variables
and the flow of the program. For the node programs on the iPSC/2 this is
infeasible. All the nodes and the host use the screen as the standard output
device. Because the nodes run concurrent processes, they sometimes send print
messages to the screen at the same time, and the output becomes unreadable. We
used this debugging method only for the host programs.

For debugging purposes, the iPSC/2 offers a debugger called decon, the
"Concurrent Debugger". This debugger allows users to trace the host and node
codes. Decon was found to be very useful. However, we encountered two flaws in
using it.

First, the debugger is not complete; some commands are not implemented yet.
For example, while tracing the program it is not possible to step through more
than one line at a time. This inability to skip multiple lines is inconvenient
when loops are encountered. Second, the debugger does not display the values of
the external variables, which are widely used in the ND algorithm. For example,
the working expression set and the original expression set are external variables
and are used by different procedures.


B. AN IMPROVED ALGORITHM 

The development of the PND algorithm helped us to understand the structure of
HAMLET and to gain experience with the iPSC/2. This work led us to develop
another method, called the Multi-branch Concurrent ND (MCND) algorithm, as an
alternative to the recursive sequential algorithm.

Searching for an exact solution with a recursive algorithm needs a large
amount of computation time. A recursive algorithm keeps track of the minterms
which have equal minimum clustering factors. The program saves the coordinates
of those minterms and computes the other branches to find a better solution.

There are two flaws in the parallel version of the ND algorithm: it searches
only one branch [See Chapter IV Section A] and uses an excessive amount of
message passing. The primary purpose of the MCND algorithm is to overcome these
problems. The MCND algorithm searches multiple branches of the search tree, and
during the search it needs message passing only for sending the original
expression at the beginning of the program. All nodes are independent of each
other and make decisions according to the rules in Chapter VI Section B. This may
provide the fastest computation, because no synchronization between nodes is
needed.
1. The Multi-Branch Concurrent ND Algorithm

An exact optimal solution requires searching the entire tree space. ND, on
the other hand, searches only one path leading to a leaf in the tree space. MCND
lies between ND and the exact solution in its operation. Its effectiveness is
limited only by the number of computational nodes available. MCND does not
guarantee an exact optimal solution. On the other hand, MCND is neither ND nor
PND. It is an extension of PND, since it relaxes the search tree.

The MCND program is loaded onto all nodes by the host. After the node
programs are loaded, all processes start to execute and then block on a
synchronous receive instruction, waiting for the host to send the message which
contains the original expression set. The host program (which is a part of
HAMLET) converts the pointers which point to the expressions to be minimized into
arrays. The message array contains the expression and the flags that tell the
nodes whether to print the implicants and maps. The host program broadcasts this
message to all nodes and blocks, waiting for the results from the nodes.

The nodes which are blocked on a receive instruction continue the
program after the message containing the original expression set is received. The
original expression set and a working expression set are created from the
information in the message array. The algorithm that the nodes execute is the
same as the algorithm in Chapter IV, but the CF_PAR algorithm is replaced
with the Multi-CF (MCF) algorithm.
The MCF algorithm groups the nodes. At the beginning, all nodes in a
cube are in one group with group size numnodes(). The clustering factors are
computed for each minterm, and the coordinates of the minterm which has the
smallest CF are saved. If the program encounters a tie, then the coordinates of
the first and the last tied minterms are saved, i.e. even if there are more than
two tied minterms, only the branches of the first and the last one will be
searched. The first and last minterms are selected instead of intermediate ones
because, when two minterms are far apart in coordinate or evaluation sequence,
they may have less chance to share the same destiny. Only two branches of the
tree are chosen because further branching is expected on those branches and the
number of nodes available is limited; each node will follow another branch of
the tree.
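
The tie rule above (keep the first and the last minterm with the smallest CF) can
be sketched in C as follows. This is an illustration under assumptions: the CF
values are taken to be already available in an array in evaluation order, and the
function name find_min_cf is hypothetical.

```c
/* Scan cf[0..n-1] in evaluation order and store the indices of the first
 * and the last minterm that have the smallest CF. Returns the number of
 * minterms sharing that minimum (0 if n <= 0); a return value greater
 * than 1 means there is a tie and two branches can be followed. */
int find_min_cf(const int *cf, int n, int *first, int *last)
{
    int i, count = 0;

    for (i = 0; i < n; i++) {
        if (count == 0 || cf[i] < cf[*first]) {
            *first = *last = i;      /* new strict minimum */
            count = 1;
        } else if (cf[i] == cf[*first]) {
            *last = i;               /* another minterm tied with the minimum */
            count++;
        }
    }
    return count;
}
```

For the CF sequence 7, 4, 9, 4, 4 the sketch reports three tied minterms and
keeps indices 1 and 4; the intermediate tie at index 3 is not followed.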

Each node knows its node number by using the system call mynode(). If there
is only one minterm with the smallest CF, then the group stays the same, MCF
returns the coordinates of that minterm to the main algorithm, and all nodes
follow the same branch. If there are two or more minterms with the same smallest
CF, then the group is divided into two. The nodes in the first group return the
coordinates of the first minterm, while the second group returns those of the
last one. All nodes adjust their group start, end, and size variables
accordingly. After the implicant is subtracted in the main algorithm, the main
algorithm requests another most isolated minterm coordinate, and the nodes
compute it on the new working expression set. If the first group again finds two
or more minterms with the same smallest CF, it divides into two groups again, and
each half returns the coordinates of a different minterm. The same procedure is
applied to the other half of the original group, which follows another branch. A
group size of 1 indicates that no nodes are left for further division. At this
point, the algorithm returns the first minterm's coordinates to the main
algorithm of the node program.


/**********************************************************************
MS_f    : Original Expression Set
WS      : Working Expression Set
SS      : Solution Set
MAX_INT : Maximum Integer Number
**********************************************************************/

{
    SS <- ø;                       /* SS = Solution Set */
    CUR_CF <- MAX_INT
    CUR_CF2 <- MAX_INT
    mygroup_start <- 0
    mygroup_size <- numnodes()
    mygroup_end <- mygroup_size - 1
    WS = MS_f = { α | α is generated by the function f; if α ∈ SAT then mark its
                  coordinate }.
    While WS ≠ ø do {
        1. Use algorithm MCF to select a minterm α from the WS.
        2. Use algorithm N to select an implicant I_α that covers α.
        3. SS = SS ∪ I_α.
        4. ∀ β ∈ I_α do {
               compute g(β) <- g(β) - g(α).
               subtract I_α from WS.
               if β is originally marked and g(β) = 0 then g(β) = r.
                                   /* don't care terms */
           }
    }
}
ALGORITHM MCF
∀ α ∈ WS do {
    Compute CF                /* compute the CF for minterm α */
    if (CF < CUR_CF) {        /* if CF of minterm α is less than current CF, then */
        CUR_CF <- CF          /* assign the CF to CUR_CF and save minterm α's */
        savecoord1 <- X_α     /* coordinates to savecoord1 */
    }
    elseif (CF = CUR_CF) {    /* if CF of minterm α is the same as current CF */
        CUR_CF2 <- CF         /* then assign it to CUR_CF2 and save its */
        savecoord2 <- X_α     /* coordinates */
    }
}
if (CUR_CF ≠ CUR_CF2)         /* if saved values of CFs are not the same then */
    return(savecoord1)        /* there is only one smallest CF; return its */
                              /* coordinates */

/* if the two CUR_CFs are the same then we have a tie */
/* each node gets its node number and calculates the first half of the group */
/* if the node number is in the first half it returns the first coordinates */
/* and reassigns the group variables */
elseif (mynode() < (mygroup_start + mygroup_size/2)) {
    mygroup_end <- mygroup_start + (mygroup_size/2 - 1)
    mygroup_size <- mygroup_size/2
    return(savecoord1)
}
/* if the node is not in the first half it returns the coordinates of the */
/* second minterm α and reassigns the group variables for that node */
else {
    mygroup_start <- (mygroup_start + mygroup_size/2)
    mygroup_size <- mygroup_size/2
    return(savecoord2)
}
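
The group bookkeeping that each node performs on a tie can be sketched in C as
follows. It is a sketch under assumptions, not the thesis implementation: the
Group type and the function split_group are hypothetical, the node number is
passed in instead of being read from mynode(), and the second half receives the
remainder when the group size is odd, whereas the pseudocode uses
mygroup_size/2 for both halves.

```c
/* Hypothetical per-node view of the group it currently belongs to. */
typedef struct {
    int start;  /* lowest node number in the group */
    int size;   /* number of nodes in the group    */
} Group;

/* Called by a node when MCF finds a tie. Shrinks *g to the half that
 * contains `node` and returns 0 if the node should follow the first
 * tied minterm or 1 if it should follow the last one. A group of size 1
 * cannot be divided, so the node simply follows the first minterm. */
int split_group(Group *g, int node)
{
    int half;

    if (g->size <= 1)
        return 0;
    half = g->size / 2;
    if (node < g->start + half) {
        g->size = half;          /* first half keeps the same start */
        return 0;
    }
    g->start += half;            /* second half starts after the first */
    g->size -= half;
    return 1;
}
```

On an 8-node cube, a first tie sends nodes 0 to 3 down the first branch and
nodes 4 to 7 down the second. No messages are exchanged, because every node
applies the same rule to its own node number.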


In the command line used to invoke the program, there are three flags
that can be set: "-m", "-i" and "-o". The first two allow the user to print the
Karnaugh maps (-m), and the CF of the minterm and the NRC of the implicant (-i).
The iPSC/2 uses a concurrent file system which allows each individual node to
open its own file with the node number as suffix. The "-o" flag specifies the
name of the output file. These files provide the execution trace to the user.

The main algorithm of each node sends a message to the host program.
This message includes the number of the node which sends the message, the number
of implicants in the minimized expression, the ratio of the minimization, and the
time spent for computation. The host program sorts the results and picks the
result which has the maximum ratio as the solution. The computation time is
defined as the computation time of the node which spent the maximum time.
Example 5: 

Assume we have an 8-node cube. Let the original expression be sent to all nodes
by message passing from the host. At the beginning, all nodes are assigned to one
group. The MCF algorithm on the nodes finds two minterms with equal smallest CFs
[See Figure 6.1]. Nodes #0-#3 assign themselves to the first group and search for
a loosely coupled implicant for CF, and nodes #4-#7 search for CF,,.

Nodes #0-#3 compute three equal smallest CFs (CF,,, CF,, and CF,,) and
select the first and third ones for searching. They divide into two groups again,
and the first group, which consists of nodes #0 and #1, computes two more CFs
(CF,,,, CF,,,). Node #0 follows CF,,, and finds a solution after finding CF,,,,.
This solution is the same as the solution that is computed by the ND algorithm.
Node #1 searches for CF,,, and computes another CF (CF,,,,). The group is out of
nodes, so even if it finds more than one CF it will only follow the first one.

Nodes #4-#7 compute CF,, and C,,, CF,, leads the algorithm to an optimum
solution. Nodes #4 and #5 compute CF,,, and reach a solution. After all nodes
have finished their tasks, they report their solutions and computation results to
the host program. The host program selects the minimum result as the solution and
the maximum computation time as the computation time of the expression.
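
The host-side selection at the end of Example 5 can be sketched in C as follows.
The NodeResult type and the function select_solution are hypothetical names for
this illustration, and "minimum result" is taken here to mean the solution with
the fewest implicants; the message layout actually used by the host is not shown
in this section.

```c
/* Hypothetical record for what one node reports back to the host. */
typedef struct {
    int node;            /* reporting node number               */
    int num_implicants;  /* size of the solution this node found */
    double seconds;      /* computation time spent on this node  */
} NodeResult;

/* Returns the index of the result with the fewest implicants (the first
 * one on ties) and stores the longest node time in *elapsed.
 * Returns -1 if there are no results. */
int select_solution(const NodeResult *results, int n, double *elapsed)
{
    int i, best = -1;

    *elapsed = 0.0;
    for (i = 0; i < n; i++) {
        if (best < 0 || results[i].num_implicants < results[best].num_implicants)
            best = i;
        if (results[i].seconds > *elapsed)
            *elapsed = results[i].seconds;
    }
    return best;
}
```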

Example 6: 

We tested 100 2 variable 4 valued expressions using the ND algorithm and the 
MCND algorithm. For four expressions, the MCND algorithm did better than the 
ND algorithm. One of them is selected as an example. The input expression to be 
minimized is expressed as: 

Ғ-2 2х2 2х2%3 Ont їхї+1 lxi дх2+3 2х{ leatl Oxi Oer Ne 


21 


1 ix. 2. esl 


х2%1 ?xj їхї+1 Зх ?9xj41 ?x? ?xl 
The working expression set is initialized to MS_f, and the original expression is
represented in Figure 6.2. The CF values of all minterms in the working set are
computed. CF value 4 is found for minterms 2 ?xj ?x2 and 1 ?xj ?xj. The
nodes are divided into two groups. The first group follows the first minterm and
the second group follows the second. The first group finds only one smallest CF
and computes the same implicant. WS11 has a tie again and the nodes in the first
group are divided into two. Nodes #0 and #1 find a solution consisting of 6
implicants. This is the same solution as that of the ND and PND algorithms [See
Appendix D]. Nodes #2 and #3 find a solution which consists of 5 implicants. The
second group of nodes is not divided, i.e. it encounters no ties. Nodes #4 - #7
find the optimum solution with 4 implicants. The search space and the group
selections are shown in Figure 6.3.

Figure 6.2: Original expression map for Example 6

The solution set for the ND and PND algorithms:

mee 2 tx, їху+1 9x) 9x21 ?x? °ху+1 ixl °х1+1 ?xj; 9xl«3 Ox) 1x; 
The optimal solution which is found by MCND:

f = 2 "xj °ху+1 !xj 1x43 ?x? !xj«3 9x? lxl 
As can be seen, the MCND algorithm finds a better solution (4 implicants) than
the PND and ND algorithms (6 implicants). The selected minterms and implicants
are reported in Appendix D.
[Search tree diagram not recoverable. Surviving labels: MS; CF1 and CF2 (nodes
0-7); WS112; CF1111, CF1121; WS1111, WS1121; CF11111, CF11211; WS11111; OPTIMUM
SOLUTION; 6 IMPLICANT SOLUTION.]

Figure 6.3: Search tree for Example 6
VI. SUMMARY AND CONCLUSIONS

As can be recalled from Chapter III Section C, certain requirements must be met
in order to derive the full benefit of parallel processing. Ideally, two nodes
should halve the time needed by a single node. But this is possible only for node
programs that run completely independently on different nodes, with no
communication time required.

The ND algorithm runs sequentially. Only after an implicant has been selected
and subtracted from the working expression set can the algorithm proceed to
compute another implicant; the update of the working expression set must be
completed before the computation continues. Only the clustering factor
computation was amenable to parallel execution, but this introduced communication
overhead. Our system was a distributed memory system; nodes cannot access the
data for the expressions from a shared memory location. All of the information
about the expression and the coordinates of the implicant must be passed to the
nodes by messages, and this must be done for every clustering factor computation
request. Clustering factor computation does not make up a large part of the
dynamic code, and the communication time grows as the number of terms and inputs
increases.
We obtained a speed-up of two in all cases. This speed-up gives us an
advantage in computing the MVL expressions compared to all other heuristics.

The PND algorithm is a faster ND algorithm. The ND algorithm is a
heuristic, i.e. it finds a near minimal solution, not an exact solution. The
ND algorithm can be improved in two ways: a recursive ND algorithm or a
concurrent ND algorithm. We chose the concurrent algorithm because a recursive
algorithm would need too much computation time. The Multi-branch Concurrent ND
algorithm is expected to spend less time computing the solution than a
recursive sequential algorithm. We expect the recursive algorithm to have a
computation time of

\[
\sum_{\text{nodeno}=0}^{\text{numnodes}()-1} \text{computation\_time}(\text{nodeno})
\]

The MCND algorithm uses only two message passing instructions: the first one
broadcasts the expression to the nodes and the second one collects the results
from the nodes. Because the results arrive at different times, the time spent
receiving the messages from the nodes is small. The MCND algorithm achieves the
minimum communication time and maximum computation time. Even though the MCND
algorithm is still a heuristic, the results are very close to the exact solution.
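
The expected cost of the recursive sequential algorithm given above is the sum of
the per-node computation times, since one processor would have to follow every
branch in turn. A minimal C sketch of that estimate follows; the function name
recursive_time_estimate and the array of per-node times are assumptions made for
illustration only.

```c
/* Estimate of the recursive sequential running time: the sum of the
 * computation times that the individual nodes reported for their branches. */
double recursive_time_estimate(const double *node_seconds, int num_nodes)
{
    double total = 0.0;
    int i;

    for (i = 0; i < num_nodes; i++)
        total += node_seconds[i];
    return total;
}
```

For four nodes reporting 3, 5, 2 and 4 seconds, the estimate is 14 seconds,
whereas MCND would report 5 seconds, the time of its slowest node.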
APPENDIX A: PND ALGORITHM PROGRAM LISTINGS

PARDEF.H

/*
 * This file provides additional structures which are defined
 * in the pardef.h file. The structures defined in this file are
 * only used by ndpar.c, cf_left.c and cf_right.c.
 */

#define MSG_TYPE1  1    /* This msg type is for sending
                           messages to the nodes */
#define MSG_TYPE2  2    /* This msg type is for receiving
                           messages from the nodes */
#define HOST_PID   10   /* process id for the host */
#define NODE_PID   0    /* process id for the node process */
#define NVAR       3    /* number of variables in expr */
#define NTERM      100  /* 2*number of terms in expr */

typedef short msg_coord;    /* buffer for coord of minterm */

typedef struct {            /* buffer for upper and lower */
    short lower,            /* limits of terms */
          upper;
} msg_bound;

typedef struct {            /* buffer for implicant */
    msg_bound B[NVAR];
    short coeff,
          rbc;
} msg_implicant;

typedef struct {            /* buffer for expression */
    msg_implicant I[NTERM];
    short radix,
          nvar,
          nterm;
} msg_expression;

typedef struct {            /* buffer for whole data to be */
                            /* sent to nodes */
    msg_expression E;
    msg_coord X[NVAR*2];
    int node_no,
        radix,
        nvar,
        All_Trun,
        value_msg[2];
} msg_to_node;

typedef struct {            /* buffer for msg from the node */
    int ea,
        dea;
} msg_from_node;

NDPAR.C (HOST PROGRAM)

#include "defs.h"
#include <cube.h>
#include "pardef.h"

/* Parallel Neighborhood Decoupling Algorithm by Oral & Yang */

ND_PAR()
/*----------------------------------------------------------------------
:function:
    - Perform the Parallel Algorithm on the input expression
:algorithm:
    Start with a working copy E_work of the original
    function E_orig;
    Initialize a final function E_final;
    While (there are still minterms to pick) {
        Pick a minterm X from E_work;
        Pick the best implicant I for X;
        Subtract I from E_work;
        Add I to E_final;
    }
:globals:
    E_orig
    e_flag
    m_flag
    q_flag
    G_flag
    FO_ratio
:side effects:
    STAT
    HEUR
    E_work
    E_final[]
:called by:
    main()
:calls:
    dealloc_expr()
    dup_expr()
    print_terms()
    print_map()
    mim()
    pick_implicant()
    subtract_implicant()
    print_source()
----------------------------------------------------------------------*/
{
    register i;
    int num_impl = 0,
        better_found;
    int *X;
    Implicant *I;
    float ratio;

    /* release any result left over from a previous run */
    if (E_final[N_P].I != NULL)
        dealloc_expr(&E_final[N_P]);

#   ifdef ANALYZER
    STAT = &NP_stat;
#   endif

    HEUR = N_P;
    dup_expr(&E_work,&E_orig);
    E_final[HEUR].nterm = 0;
    E_final[HEUR].radix = E_orig.radix;
    E_final[HEUR].nvar = E_orig.nvar;
    E_final[HEUR].I = NULL;

    /* load the node programs once: lower half left, upper half right */
    if (!load_flag) {
        setpid(HOST_PID);
        for (i=0; i < (numnodes()/2); i++) {
            load("/usr/oral/onurpar/mvlcpar/cf_left",i,0);
            load("/usr/oral/onurpar/mvlcpar/cf_right",
                 i+(numnodes()/2),0);
        }
        load_flag = 1;
    }
#   ifdef ANALYZER
    if (e_flag)
        print_terms(&E_orig);
    if (m_flag) {
        printf(" Orig map (ND_PAR):\n");
        print_map();
    }
#   endif
    better_found = 0;

    resource_used(START);
    for (;;) {
        /* no minterm left: the working set is fully covered */
        if ((X = mim(&E_work)) == NULL) {
            if (num_impl < E_orig.nterm)
                better_found = 1;
            break;
        }
        I = pick_implicant(X);
        num_impl++;
        subtract_implicant(I);

#       ifdef ANALYZER
        if (i_flag)
            print_implicant(X,I);
        if (m_flag)
            print_map();
#       endif
        /* give up once the result cannot beat the original expression */
        if (!m_flag) {
            if (num_impl >= E_orig.nterm)
                break;
        }
    }
    resource_used(STOP);
    if (!better_found) {
        num_impl = E_orig.nterm;
        dup_expr(&(E_final[N_P]),&E_orig);
    }

    ratio = ((double)num_impl/(double)E_orig.nterm);

#   ifdef ANALYZER
    if (!q_flag && !G_flag) {
        if (!better_found)
            printf("%-4d ND_PAR: %4d/%-4d %4.2f %6ld:%3.3ld\n",
                   expr_seq,num_impl,num_impl,0.0,secs_used(),tsecs_used());
        else
            printf("%-4d ND_PAR: %4d/%-4d %4.2f %6d:%3.3ld\n",
                   expr_seq,num_impl,E_orig.nterm,ratio,
                   secs_used(),tsecs_used());
    }
#   endif
    dealloc_expr(&E_work);
}

static int *mim(E)
Expression *E;
/* ----------------------------------------------------------------------
    :function:
        - Find the Most Isolated Minterm in the expression pointed to
          by E, and return its coordinates as a vector.
        - Local to ndpar.c
    :globals:
        radix
        nvar
    :side effects:
        STAT
    :called by:
        ND_PAR()
    :calls:
        next_coord()
        eval()
        vcopy()
    :returns:
        - A vector of integers representing the coordinate of the
          most isolated minterm, or NULL if no more minterms.
        - The value at that location is also returned as the last
          integer in the vector.
---------------------------------------------------------------------- */
{
    register i,j,k;
    int cur_val = E->radix,
        cur_CF = MAX_INT,
        X_orig[MAX_VAR+2],
        R_1 = radix - 1,
        Not_all = 0,
        All_trun = 0,
        TRUN = 2*R_1,
        last = 0,
        value[2],
        cf,
        ea,
        dea,
        term;
    int *X,*next_coord();
    static int
        coord[MAX_VAR+2],
        save_coord[MAX_VAR+2];
    msg_to_node
        msg_to_node_cf;
    msg_from_node
        msg_from_node_cf;

#   ifdef ANALYZER
    STAT->calls_mim++;
#   endif

    /* pack the working expression into the message for the nodes */
    for (i=0; i < E_work.nterm; i++) {
        msg_to_node_cf.E.I[i].coeff = E->I[i].coeff;
        msg_to_node_cf.E.I[i].rbc = E->I[i].rbc;
        for (j=0; j < nvar; j++) {
            msg_to_node_cf.E.I[i].B[j].upper = E->I[i].B[j].upper;
            msg_to_node_cf.E.I[i].B[j].lower = E->I[i].B[j].lower;
        }
    }
    msg_to_node_cf.E.radix = radix;
    msg_to_node_cf.E.nvar = nvar;
    msg_to_node_cf.E.nterm = E_work.nterm;
    msg_to_node_cf.All_Trun = All_trun;

    for (term=0; term < E_orig.nterm; term++) {
        k = 1;
        while ((X=next_coord(coord,&(E->I[term]),k)) != NULL) {
            vcopy(value,eval(&E_work,X));
            if (value[EVAL] && value[EVAL] < radix) {
                cf = 0;
                dea = 0;
                ea = 0;
                if (!value[HLV])
                    Not_all = 1;

                msg_to_node_cf.All_Trun = All_trun;
                for (i=0; i < nvar+2; i++) msg_to_node_cf.X[i] = X[i];
                vcopy(msg_to_node_cf.value_msg,value);

                /* broadcast to all nodes, then wait for the summed result */
                csend(MSG_TYPE1,&msg_to_node_cf,sizeof(msg_to_node_cf),-1,0);
                crecv(MSG_TYPE3,&msg_from_node_cf,sizeof(msg_from_node_cf));

                /* compute the clustering factor */
                cf = (msg_from_node_cf.dea * R_1) +
                     msg_from_node_cf.ea;

                if (!(value[HLV] && cf > TRUN) || All_trun) {
                    if (cf < cur_CF) {
                        cur_val = value[EVAL];
                        cur_CF = cf;
                        for (i=0; i < nvar; i++) save_coord[i] = X[i];
                    }
                }
            }
            k = 0;
        }

        if (!last && (term == (E_orig.nterm - 1)) && !Not_all) {
            All_trun = 1;
            cur_CF = MAX_INT;
            term = -1;
            last = 1;
        }
    }

    if (cur_CF == MAX_INT)
        return(NULL);

    save_coord[nvar+1] = cur_CF;
    save_coord[nvar] = cur_val;

    return(save_coord);
}

static int valid_implicant(I)
Implicant *I;
/* ----------------------------------------------------------------------
    :function:
        - Decide upon the validity of implicant I
        - Local to ndpar.c
    :globals:
        E_work
        E_orig
    :side effects:
        STAT
    :called by:
        pick_implicant()
    :calls:
        next_coord()
        eval()
        vcopy()
    :returns:
        1 if a valid implicant
        0 if not
---------------------------------------------------------------------- */
{
    int *X;
    int init = 1;
    int R_1 = radix - 1;
    int value = I->coeff;
    int Vo[2],Vw[2];
    static int
        coord[MAX_VAR+2];

#   ifdef ANALYZER
    STAT->calls_valid_implicant++;
#   endif

    while ((X = next_coord(coord,I,init)) != NULL) {
        init = 0;
        vcopy(Vw,eval(&E_work,X));
        vcopy(Vo,eval(&E_orig,X));
        if (((Vw[EVAL] < value) && !Vw[HLV]) && (Vo[EVAL] < R_1))
            return(0);
    }
    return(1);
}

static int compute_rbc(I)
Implicant *I;
/* ----------------------------------------------------------------------
    :function:
        - Compute the RBC for the given implicant
        - Local to ndpar.c
    :globals:
        radix
        nvar
    :side effects:
        STAT
    :called by:
        pick_implicant()
    :calls:
        next_coord()
        eval()
        vcopy()
    :returns:
        - an integer RBC
---------------------------------------------------------------------- */
{
    int *X;
    int I_value = I->coeff;
    register i;
    int value[2],
        R_1 = radix - 1,
        neighbor_value[2],
        good,
        bad,
        diff,
        equal,
        neig_boun,
        first,
        rbc = 0,
        init = 1;
    static int
        coord[MAX_VAR+2];
#   ifdef ANALYZER
    STAT->calls_compute_rbc++;
#   endif

    /* for each coordinate in the implicant ... */
    while ((X = next_coord(coord,I,init)) != NULL) {
        init = 0;
        equal = 0;
        vcopy(value,eval(&E_work,X));
        if (value[EVAL] == radix)
            continue;
        diff = value[EVAL] - I_value;
        first = 1;

        /* for each direction ... */
        for (i=0; i < nvar; i++) {
            good = 0;
            bad = 0;
            if ((diff <= 0) && first) {
                good = 2;
                first = 0;
            }

            /* if there is a left neighbor, examine it */
            if (X[i] != 0 && X[i] == I->B[i].lower) {
                X[i]--;
                vcopy(neighbor_value,eval(&E_work,X));
                neig_boun = neighbor_value[EVAL] - value[EVAL];
                X[i]++;
                if (neighbor_value[EVAL] != 0) {
                    if (!neighbor_value[HLV] || !value[HLV]) {
                        if (neighbor_value[EVAL] < diff) {
                            if (neighbor_value[HLV])
                                good += 1;
                            else
                                bad += 2;
                        }
                        if (neighbor_value[EVAL] > diff) {
                            if (!neig_boun)
                                bad += 2;
                            if (neighbor_value[HLV] && neig_boun < 0)
                                bad += 2;
                            if (diff > 0 && neig_boun) {
                                if (value[HLV])
                                    good += 1;
                                else
                                    bad += 2;
                            }
                        }
                        else {
                            if (neighbor_value[HLV] || value[HLV])
                                good += 1;
                            else
                                good += 2;
                        }
                    }
                }
            }

            /* if there is a right neighbor, examine it */
            if (X[i] != R_1 && X[i] == I->B[i].upper) {
                X[i]++;
                vcopy(neighbor_value,eval(&E_work,X));
                neig_boun = neighbor_value[EVAL] - value[EVAL];
                X[i]--;
                if (neighbor_value[EVAL] != 0) {
                    if (!neighbor_value[HLV] || !value[HLV]) {
                        if (neighbor_value[EVAL] < diff) {
                            if (neighbor_value[HLV])
                                good += 1;
                            else
                                bad += 2;
                        }
                        if (neighbor_value[EVAL] > diff) {
                            if (!neig_boun)
                                bad += 2;
                            if (neighbor_value[HLV] && neig_boun < 0)
                                bad += 2;
                            if (diff > 0 && neig_boun) {
                                if (value[HLV])
                                    good += 1;
                                else
                                    bad += 2;
                            }
                        }
                        else {
                            if (neighbor_value[HLV] || value[HLV])
                                good += 1;
                            else
                                good += 2;
                        }
                    }
                }
            }

            /* update the rbc */
            rbc = (rbc - good) + bad;
        }
    }
    return(rbc);
}


static Implicant *pick_implicant(X)
int *X;
/* ----------------------------------------------------------------------
    :function:
        - Pick the best implicant for minterm X
    :globals:
        radix
    :side effects:
        STAT
    :called by:
        ND_PAR()
    :calls:
        init_implicant()
        gen_bounds()
        next_implicant()
        eval()
        vcopy()
        compute_rbc()
        copy_implicant()
        valid_implicant()
    :returns:
        - A pointer to a term representing the best implicant.
---------------------------------------------------------------------- */
{
    int cur_rbc = MAX_INT,
        rbc = 0,
        I_value,
        i,
        init = 1,
        first = 1;
    Implicant *I;
    static int
        coord[MAX_VAR+2];
    static Bound I_bound[MAX_VAR+2];
    static Implicant I_best;
    Bound *B;
    int V[2],
        value[2];

#   ifdef ANALYZER
    STAT->calls_pick_implicant++;
#   endif

    I_best.B = I_bound;
    init_implicant(X);
    B = gen_bounds(X);
    vcopy(V,eval(&E_orig,X));

    while ((I = next_implicant(B)) != NULL) {
        if (V[HLV]) {
            /* saturated minterm: try every coefficient up to radix-1 */
            for (I->coeff=X[nvar]; I->coeff < radix;
                 (I->coeff)++) {
                if (valid_implicant(I)) {
                    rbc = compute_rbc(I);
                    if (first)
                        rbc = 2;
                    else
                        rbc += 2;
                    if (rbc <= cur_rbc) {
                        cur_rbc = rbc;
                        I->rbc = rbc;
                        copy_implicant(&I_best,I);
                    }
                    first = 0;
                }
            }
        }
        else {
            I->coeff = X[nvar];
            if (valid_implicant(I)) {
                rbc = compute_rbc(I);
                if (first) {
                    first = 0;
                    if (rbc < 0)
                        rbc = 1;
                    else
                        rbc += 2;
                }
                else
                    rbc += 2;
                if (rbc <= cur_rbc) {
                    cur_rbc = rbc;
                    I->rbc = rbc;
                    copy_implicant(&I_best,I);
                }
            }
        }
    }
    return(&I_best);
}


NODE PROGRAM LISTINGS 


CF_LEFT.C (NODE PROGRAM)

#include "defs.h"
#include "pardef.h"
#include <cube.h>

main() {
    int
        expanded,
        var_no,
        val1[2];
    long ea[2],
        work1[2];
    msg_to_node
        msg_to_node_cf;
    msg_from_node
        msg_from_node_cf;

    for (;;) {
        ea[0] = 0;
        ea[1] = 0;
        expanded = 0;
        crecv(MSG_TYPE1,&msg_to_node_cf,sizeof(msg_to_node_cf));
        var_no = mynode();

        if (var_no < msg_to_node_cf.E.nvar) {
            /* expand to the left along variable var_no */
            while (msg_to_node_cf.X[var_no] > 0) {
                msg_to_node_cf.X[var_no]--;
                vcopy(val1,eval(msg_to_node_cf.E,msg_to_node_cf.X));
                if (val1[EVAL] && (val1[EVAL] <=
                    msg_to_node_cf.value_msg[EVAL]
                    || msg_to_node_cf.value_msg[HLV])) {
                    expanded = 1;
                    ea[0]++;
                }
                else break;
            }
            if (expanded) ea[1]++;
        }

        /* global sum of ea[] over all nodes; node 0 reports to the host */
        gisum(&ea[0],2,&work1[0]);
        if (mynode() == 0) {
            msg_from_node_cf.ea = ea[0];
            msg_from_node_cf.dea = ea[1];
            csend(MSG_TYPE3,&msg_from_node_cf,sizeof(msg_from_node_cf),
                  myhost(),HOST_PID);
        }
    }
}


int *eval(E,X)
msg_expression E;
short X[NVAR];
/* ----------------------------------------------------------------------
    :function:
        - Evaluate the expression at X, where X is a vector of
          coordinates
    :returns:
        - A vector with the value of the expression at the
          specified coordinate as its first element, and a flag
          set if this value has attained the highest logic value
          (HLV)
---------------------------------------------------------------------- */
{
    register i,j,k;
    int out_of_bounds;
    static int V[2];
    register rm1 = E.radix-1;

    V[EVAL] = 0;
    V[HLV] = 0;
    /* for each term ... */
    for (i=0; i < E.nterm; i++) {
        /* for each variable ... */
        for (j=0,out_of_bounds=0; j < E.nvar; j++) {
            if (
                (X[j] < E.I[i].B[j].lower) ||
                (X[j] > E.I[i].B[j].upper)
               ) {
                out_of_bounds = 1;
                break;
            }
        }
        if (out_of_bounds)
            continue;

        /* if this is a don't care, return the radix */
        if (E.I[i].coeff == E.radix) {
            V[EVAL] = E.radix;
            return(V);
        }

        V[EVAL] += E.I[i].coeff;
        if (V[EVAL] >= rm1) {
            /* set a flag which means E_orig was saturated at this X */
            V[HLV] = 1;
        }

        if (V[EVAL] > rm1) {
            V[EVAL] = rm1;
        }
        else if (V[HLV] && (V[EVAL] <= 0)) {
            V[EVAL] = E.radix;
            return(V);
        }
    }
    return(V);
}

vcopy(d,s)
int *d,*s;
{
    d[0] = s[0];
    d[1] = s[1];
}

CF_RIGHT.C (NODE PROGRAM)


#include "defs.h"
#include "pardef.h"
#include <cube.h>

main() {
    int
        expanded,
        var_no,
        val1[2];
    long ea[2],
        work1[2];
    msg_to_node
        msg_to_node_cf;
    msg_from_node
        msg_from_node_cf;

    for (;;) {
        ea[0] = 0;
        ea[1] = 0;
        expanded = 0;
        crecv(MSG_TYPE1,&msg_to_node_cf,sizeof(msg_to_node_cf));
        var_no = mynode() - (numnodes()/2);

        if (var_no < msg_to_node_cf.E.nvar) {
            /* expand to the right along variable var_no */
            while (msg_to_node_cf.X[var_no] < ((msg_to_node_cf.E.radix)-1)) {
                msg_to_node_cf.X[var_no]++;
                vcopy(val1,eval(msg_to_node_cf.E,msg_to_node_cf.X));
                if (val1[EVAL] && (val1[EVAL] <=
                    msg_to_node_cf.value_msg[EVAL]
                    || msg_to_node_cf.value_msg[HLV])) {
                    expanded = 1;
                    ea[0]++;
                }
                else break;
            }
            if (expanded) ea[1]++;
        }

        /* global sum of ea[] over all nodes */
        gisum(&ea[0],2,&work1[0]);
    }
}

int *eval(E,X)
msg_expression E;
short X[NVAR];
/* ----------------------------------------------------------------------
    :function:
        - Evaluate the expression at X, where X is a vector of
          coordinates
    :returns:
        - A vector with the value of the expression at the
          specified coordinate as its first element, and a flag
          set if this value has attained the highest logic value
          (HLV)
---------------------------------------------------------------------- */
{
    register i,j,k;
    int out_of_bounds;
    static int V[2];
    register rm1 = E.radix-1;

    V[EVAL] = 0;
    V[HLV] = 0;

    /* for each term ... */
    for (i=0; i < E.nterm; i++) {
        /* for each variable ... */
        for (j=0,out_of_bounds=0; j < E.nvar; j++) {
            if (
                (X[j] < E.I[i].B[j].lower) ||
                (X[j] > E.I[i].B[j].upper)
               ) {
                out_of_bounds = 1;
                break;
            }
        }
        if (out_of_bounds)
            continue;

        /* if this is a don't care, return the radix */
        if (E.I[i].coeff == E.radix) {
            V[EVAL] = E.radix;
            return(V);
        }

        V[EVAL] += E.I[i].coeff;
        if (V[EVAL] >= rm1) {
            /* set a flag which means E_orig was saturated at this X */
            V[HLV] = 1;
        }

        if (V[EVAL] > rm1) {
            V[EVAL] = rm1;
        }
        else if (V[HLV] && (V[EVAL] <= 0)) {
            V[EVAL] = E.radix;
            return(V);
        }
    }
    return(V);
}

vcopy(d,s)
int *d,*s;
{
    d[0] = s[0];
    d[1] = s[1];
}
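
The division of work between the cf_left and cf_right node programs can be illustrated without the iPSC/2 library. The following is a minimal single-process sketch, not the thesis code: it assumes a two-variable, radix-4 function stored as a plain table, and the names `node_expand`, `clustering_factor`, `struct partial`, `RADIX` and `NVARS` are hypothetical. Each simulated node walks one direction of one variable, the loop in `clustering_factor` plays the role of the global sum (`gisum()` in the listings), and the host-side formula CF = dea * (radix - 1) + ea is the one used in mim().

```c
#include <assert.h>

#define RADIX 4   /* assumed radix for this sketch */
#define NVARS 2   /* assumed number of variables   */

/* Partial result of one simulated node: ea counts the cells expanded
 * into, dea is 1 if any expansion was possible in that direction. */
struct partial { int ea; int dea; };

/* Hypothetical stand-in for one node program.  node < NVARS acts like a
 * cf_left node (walks toward 0); node >= NVARS acts like a cf_right node
 * (walks toward RADIX-1) for variable node - NVARS. */
static struct partial node_expand(int map[RADIX][RADIX], const int x[NVARS],
                                  int value, int hlv, int node)
{
    struct partial p = {0, 0};
    int var = node % NVARS;
    int step = (node < NVARS) ? -1 : 1;
    int c[NVARS];
    int i, v;

    for (i = 0; i < NVARS; i++) c[i] = x[i];
    for (;;) {
        c[var] += step;
        if (c[var] < 0 || c[var] >= RADIX) break;   /* hit the map edge */
        v = map[c[0]][c[1]];
        /* same acceptance test as the node listings */
        if (v && (v <= value || hlv)) { p.ea++; p.dea = 1; }
        else break;
    }
    return p;
}

/* Host side: add up the partial results of all 2*NVARS simulated nodes
 * and form the clustering factor. */
static int clustering_factor(int map[RADIX][RADIX], const int x[NVARS], int hlv)
{
    int node, ea = 0, dea = 0, value;

    assert(x[0] >= 0 && x[0] < RADIX && x[1] >= 0 && x[1] < RADIX);
    value = map[x[0]][x[1]];
    for (node = 0; node < 2 * NVARS; node++) {
        struct partial p = node_expand(map, x, value, hlv, node);
        ea += p.ea;
        dea += p.dea;
    }
    return dea * (RADIX - 1) + ea;
}
```

In this sketch a minterm surrounded by zeros gets a clustering factor of 0, so it would be preferred as the most isolated minterm over one that can expand in several directions.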
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APPENDIX B: MCND ALGORITHM PROGRAM LISTINGS 
PARDEF2.H

#define MSG_TYPE1   1
#define MSG_TYPE2   2
#define HOST_PID    10
#define NODE_PID    0
#define NVAR        2
#define NTERM       10
typedef short msg_coord;

typedef struct {
    short lower,
          upper;
} msg_bound;
typedef struct {
    msg_bound B[NVAR];
    short coeff,
          rbc;
} msg_implicant;
typedef struct {
    msg_implicant I[NTERM];
    short radix,
          nvar,
          nterm;
    int
          i_flag,
          m_flag;
    char of_file[MAX_PATH+1];
} msg_expression;

typedef struct {
    float ratio;
    int num_impl,
        node_no;
    long secs,
         msecs;
} msg_from_node;



MCND Algorithm Host Program Listings 


#include "defs.h"
#include <cube.h>
#include "pardef2.h"

/* Multi-branch Concurrent Algorithm (Host) by Oral & Yang ------------ */
OPT_ND()
/* ----------------------------------------------------------------------
    :function:
        - Perform the MCND Algorithm on the input expression
---------------------------------------------------------------------- */
{
    register i,j;
    int num_impl = 0;
    float ratio;
    msg_expression
        msg_to_node;
    msg_from_node
        msg_from_node_first;

    if (E_final[O_N].I != NULL)
        dealloc_expr(&E_final[O_N]);

#   ifdef ANALYZER
    STAT = &ON_stat;
#   endif

    HEUR = O_N;
    E_final[HEUR].nterm = 0;
    E_final[HEUR].radix = E_orig.radix;
    E_final[HEUR].nvar = E_orig.nvar;
    E_final[HEUR].I = NULL;

    /* load the node program on all nodes only once */
    if (!load_flag) {
        setpid(HOST_PID);
        load("/usr/oral/onurpar2/mvlcpar/opt_nd_n",-1,0);
        load_flag = 1;
    }
#   ifdef ANALYZER
    if (e_flag)
        print_terms(&E_orig);
#   endif

    /* pack the original expression into the message for the nodes */
    msg_to_node.nvar = E_orig.nvar;
    msg_to_node.nterm = E_orig.nterm;
    msg_to_node.radix = E_orig.radix;
    msg_to_node.i_flag = i_flag;
    msg_to_node.m_flag = m_flag;
    strcpy(msg_to_node.of_file,of_file);
    for (i=0; i < E_orig.nterm; i++) {
        msg_to_node.I[i].coeff = E_orig.I[i].coeff;
        msg_to_node.I[i].rbc = E_orig.I[i].rbc;
        for (j=0; j < E_orig.nvar; j++) {
            msg_to_node.I[i].B[j].upper = E_orig.I[i].B[j].upper;
            msg_to_node.I[i].B[j].lower = E_orig.I[i].B[j].lower;
        }
    }
    csend(MSG_TYPE1,&msg_to_node,sizeof(msg_to_node),-1,0);

    /* collect one result from every node */
    for (i=0; i < numnodes(); i++) {
        crecv(MSG_TYPE2,&msg_from_node_first,sizeof(msg_from_node_first));
        printf("%-4d OPT_PAR: %4d/%-4d %4.2f %6d:%3.3ld From node: %d\n",
            expr_seq,msg_from_node_first.num_impl,E_orig.nterm,
            msg_from_node_first.ratio,msg_from_node_first.secs,
            msg_from_node_first.msecs,msg_from_node_first.node_no);
    }

    dealloc_expr(&E_work);
}

MCND Algorithm Node Program Listings


#include "defs.h"
#include "pardef2.h"
#include <cube.h>
#include <fcntl.h>

/* Global data structures ---------------------------------------------- */
/* Logic expressions:

    E_orig
        - holds the original input expression as parsed

    E_work
        - a copy of E_orig
        - implicants are subtracted from this expression as terms
          during the course of optimization

    E_final[]
        - the result expression (starts out empty)
        - each term is one implicant found during optimization
        - each heuristic has its own E_final (for comparison)
*/

Expression
    E_orig = { 0,0,0,NULL },
    E_work = { 0,0,0,NULL },
    E_final[5] = {
        { 0,0,0,NULL },
        { 0,0,0,NULL },
        { 0,0,0,NULL },
        { 0,0,0,NULL },
        { 0,0,0,NULL }
    };
int HEUR;           /* Current heuristic
                     * HEUR indexes into E_final[]
                     * depending upon the currently
                     * active heuristic
                     */
int FINAL;          /* Index of the selected final
                     * expression
                     */
long mygroup_start,
     mygroup_end,
     mygroup_size;
int fd;
char msg[100];
/* Multi-branch Concurrent ND algorithm for a node by Oral & Yang ------ */
/*
    :function:
        - Performs the MCND algorithm on a node
    :algorithm:
        Receive original expression set from host
        Start with working copy E_work of the original function E_orig
        Initialize a final function E_final
        While (there are still minterms to pick) {
            Pick a minterm X from E_work
            Pick the best implicant I for X
            Subtract I from E_work
            Add I to E_final
        }
*/
main()
{
    register i,j;
    int num_impl,
        better_found,
        expr_seq = 0;
    static char cfs[4] = "###";
    int *X;
    Implicant *I;
    double ratio;
    unsigned long T1,T2,
        time;
    msg_expression
        msg_to_node;
    msg_from_node
        msg_from_node_first;

    for (;;) {
        expr_seq++;
        num_impl = 0;
        mygroup_start = 0;
        mygroup_size = numnodes();
        mygroup_end = mygroup_size - 1;

        crecv(MSG_TYPE1,&msg_to_node,sizeof(msg_to_node));

        if (msg_to_node.i_flag | msg_to_node.m_flag) {
            strcat(msg_to_node.of_file,cfs);
            fd = open(msg_to_node.of_file,O_CREAT | O_RDWR | O_APPEND, 0644);
        }
        dup_expr(&E_orig,&msg_to_node);

        if (E_final[O_N].I != NULL)
            dealloc_expr(&E_final[O_N]);

        HEUR = O_N;
        dup_expr(&E_work,&msg_to_node);
        E_final[HEUR].nterm = 0;
        E_final[HEUR].radix = E_orig.radix;
        E_final[HEUR].nvar = E_orig.nvar;
        E_final[HEUR].I = NULL;
        if (msg_to_node.m_flag) {
            sprintf(msg," Orig map(OPT_ND):\n");
            cwrite(fd,msg,strlen(msg));
            print_map();
        }
        better_found = 0;

        T1 = mclock();
        for (;;) {
            if ((X = mim(&E_work)) == NULL) {
                if (num_impl < E_orig.nterm)
                    better_found = 1;
                break;
            }
            I = pick_implicant(X);
            num_impl++;
            subtract_implicant(I);
            if (msg_to_node.i_flag)
                print_implicant(X,I);
            if (msg_to_node.m_flag)
                print_map();
        }

        T2 = mclock();
        time = T2 - T1;
        if (!better_found) {
            num_impl = E_orig.nterm;
            dup_expr(&(E_final[O_N]),&E_orig);
        }

        ratio = ((double)num_impl/(double)E_orig.nterm);
        if (ratio == 1) ratio = 0;
        msg_from_node_first.ratio = ratio;
        msg_from_node_first.node_no = mynode();
        msg_from_node_first.num_impl = num_impl;
        msg_from_node_first.secs = time / 1000;
        msg_from_node_first.msecs = time - (msg_from_node_first.secs * 1000);

        csend(MSG_TYPE2,&msg_from_node_first,sizeof(msg_from_node_first),
              myhost(),HOST_PID);
        if (msg_to_node.i_flag | msg_to_node.m_flag) {
            sprintf(msg,"%-4d OPT_PAR: %4d/%-4d %4.2f %6d:%3.3ld From node: %d\n",
                expr_seq,num_impl,E_orig.nterm,ratio,msg_from_node_first.secs,
                msg_from_node_first.msecs,mynode());
            cwrite(fd,msg,strlen(msg));
        }

        dealloc_expr(&E_work);
        close(fd);
    }
}


static int *mim(E)
Expression *E;
/* ----------------------------------------------------------------------
    :function:
        - Find the Most Isolated Minterm in the expression pointed to
          by E, and return its coordinates as a vector.
        - Local to opt_nd_n.c
    :globals:
        radix
        nvar
    :side effects:
        STAT
    :called by:
        main()
    :calls:
        next_coord()
        eval()
        vcopy()
    :returns:
        - A vector of integers representing the coordinate of the most
          isolated minterm, or NULL if no more minterms.
        - The value at that location is also returned as the last integer
          in the vector.
        - If there is a tie (more than one smallest CF value) it returns
          first and last, and divides the nodes into two groups.
---------------------------------------------------------------------- */
{
    register i,j,k;
    int cur_val = E->radix,
        cur_val2 = E->radix,
        cur_CF = MAX_INT,
        cur_CF2 = MAX_INT,
        X_orig[MAX_VAR+2],
        R_1 = E_orig.radix - 1,
        Not_all = 0,
        All_trun = 0,
        TRUN = 2*R_1,
        last = 0,
        expanded,
        value[2],
        val1[2],
        val2[2],
        cf,
        ea,
        dea,
        term;
    int *X,*next_coord();
    static int
        coord[MAX_VAR+2],
        save_coord1[MAX_VAR+2],
        save_coord2[MAX_VAR+2];

    for (term=0; term < E_orig.nterm; term++) {
        k = 1;
        while ((X=next_coord(coord,&(E->I[term]),k)) != NULL) {
            vcopy(value,eval(E,X));
            if (value[EVAL] && value[EVAL] < E_orig.radix) {
                if (!value[HLV])
                    Not_all = 1;
                if (All_trun) {
                    cf = 0;
                    dea = 0;
                    ea = 0;
                    for (j=0; j < E_orig.nvar; j++) X_orig[j] = X[j];
                    /* for each variable (direction)... */
                    for (j=0; j < E_orig.nvar; j++) {
                        expanded = 0;
                        /* If not on a left hand edge, move left */
                        while (X[j] > 0) {
                            X[j]--;
                            vcopy(val1,eval(E,X));
                            if (val1[EVAL]) {
                                expanded = 1;
                                ea++;
                            }
                            else break;
                        }
                        X[j] = X_orig[j];
                        if (expanded) {
                            expanded = 0;
                            dea++;
                        }
                        /* if we didn't start on a right hand edge, move right */
                        while (X[j] < R_1) {
                            X[j]++;
                            vcopy(val2,eval(E,X));
                            if (val2[EVAL]) {
                                expanded = 1;
                                ea++;
                            }
                            else break;
                        }
                        X[j] = X_orig[j];
                        if (expanded)
                            dea++;
                    }

                    /* compute the clustering factor */
                    cf = (dea * R_1) + ea;
                    if (cf < cur_CF) {
                        cur_val = value[EVAL];
                        cur_CF = cf;
                        for (i=0; i < E_orig.nvar; i++) save_coord1[i] = X[i];
                    }
                    else if (cf == cur_CF) {
                        cur_val2 = value[EVAL];
                        cur_CF2 = cf;
                        for (i=0; i < E_orig.nvar; i++) save_coord2[i] = X[i];
                    }
                }
                else {
                    cf = 0;
                    dea = 0;
                    ea = 0;

                    for (j=0; j < E_orig.nvar; j++) X_orig[j] = X[j];
                    /* for each variable (direction)... */
                    for (j=0; j < E_orig.nvar; j++) {
                        expanded = 0;
                        /* If not on a left hand edge, move left */
                        while (X[j] > 0) {
                            X[j]--;
                            vcopy(val1,eval(E,X));
                            if (val1[EVAL] && (val1[EVAL] <= value[EVAL]
                                || value[HLV])) {
                                expanded = 1;
                                ea++;
                            }
                            else
                                break;
                        }
                        X[j] = X_orig[j];
                        if (expanded) {
                            expanded = 0;
                            dea++;
                        }
                        /* if we didn't start on a right hand edge, move right */
                        while (X[j] < R_1) {
                            X[j]++;
                            vcopy(val2,eval(E,X));
                            if (val2[EVAL] && (val2[EVAL] <= value[EVAL]
                                || value[HLV])) {
                                expanded = 1;
                                ea++;
                            }
                            else
                                break;
                        }
                        X[j] = X_orig[j];
                        if (expanded)
                            dea++;
                    }

                    /* compute the clustering factor */
                    cf = (dea * R_1) + ea;
                    if (!(value[HLV] && cf > TRUN)) {
                        if (cf < cur_CF) {
                            cur_val = value[EVAL];
                            cur_CF = cf;
                            for (i=0; i < E_orig.nvar; i++) save_coord1[i] = X[i];
                        }
                        else if (cf == cur_CF) {
                            cur_val2 = value[EVAL];
                            cur_CF2 = cf;
                            for (i=0; i < E_orig.nvar; i++) save_coord2[i] = X[i];
                        }
                    }
                }
            }
            k = 0;
        }
        if (!last && (term == (E_orig.nterm - 1)) && !Not_all) {
            All_trun = 1;
            cur_CF = MAX_INT;
            term = -1;
            last = 1;
        }
    }
    if (cur_CF == MAX_INT)
        return(NULL);

    save_coord1[E_orig.nvar+1] = cur_CF;
    save_coord1[E_orig.nvar] = cur_val;
    save_coord2[E_orig.nvar+1] = cur_CF;
    save_coord2[E_orig.nvar] = cur_val2;

    /* on a tie, the upper half of the node group follows the last candidate */
    if (cur_CF != cur_CF2) return(save_coord1);
    else if (mynode() >= (mygroup_start + mygroup_size/2)) {
        mygroup_start = mygroup_start + mygroup_size/2;
        mygroup_size = mygroup_size/2;
        return(save_coord2);
    }
    else {
        mygroup_end = mygroup_start + (mygroup_size/2 - 1);
        mygroup_size = mygroup_size/2;
        return(save_coord1);
    }
}
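
The group-splitting rule at the end of mim() can be looked at in isolation. The following is a small sketch under stated assumptions, not the thesis code: `struct group` and `split_on_tie` are hypothetical names, and the bookkeeping of mygroup_end is left out. On a clustering-factor tie, nodes in the upper half of the current group continue with the second candidate minterm, nodes in the lower half keep the first, and the group size is halved in both cases.

```c
#include <assert.h>

/* Range of node numbers that currently share one branch of the search. */
struct group { long start; long size; };

/* Apply one tie split for the given node number.  Returns 1 if the node
 * moves to the second candidate (upper half), 0 if it keeps the first
 * candidate (lower half).  The group is halved either way, using the
 * same integer division as the listing. */
static int split_on_tie(struct group *g, long node)
{
    long half = g->size / 2;

    assert(node >= g->start);
    if (node >= g->start + half) {
        g->start += half;
        g->size = half;
        return 1;
    }
    g->size = half;
    return 0;
}
```

With 8 nodes, three successive ties are enough for every node to end up on its own branch, which suggests that the number of branches explored concurrently is bounded by the number of nodes.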


static int valid_implicant(I)
Implicant *I;
/* ----------------------------------------------------------------------
    :function:
        - Decide upon the validity of implicant I
        - Local to opt_nd_n.c
    :globals:
        E_work
        E_orig
    :side effects:
        STAT
    :called by:
        pick_implicant()
    :calls:
        next_coord()
        eval()
        vcopy()
    :returns:
        1 if a valid implicant
        0 if not
---------------------------------------------------------------------- */
{
    int *X;
    int init = 1;
    int R_1 = E_orig.radix - 1;
    int value = I->coeff;
    int Vo[2],Vw[2];
    static int
        coord[MAX_VAR+2];

    while ((X = next_coord(coord,I,init)) != NULL) {
        init = 0;
        vcopy(Vw,eval(&E_work,X));
        vcopy(Vo,eval(&E_orig,X));
        if (((Vw[EVAL] < value) && !Vw[HLV]) && (Vo[EVAL] < R_1))
            return(0);
    }
    return(1);
}


static int compute_rbc(I)
Implicant *I;
/* ----------------------------------------------------------------------
    :function:
        - Compute the RBC for the given implicant
        - Local to opt_nd_n.c
    :globals:
        radix
        nvar
    :side effects:
        STAT
    :called by:
        pick_implicant()
    :calls:
        next_coord()
        eval()
        vcopy()
    :returns:
        - an integer RBC
---------------------------------------------------------------------- */
{
    int *X;
    int I_value = I->coeff;
    register i;
    int value[2],
        R_1 = E_orig.radix - 1,
        neighbor_value[2],
        good,
        bad,
        diff,
        equal,
        neig_boun,
        first,
        rbc = 0,
        init = 1;
    static int
        coord[MAX_VAR+2];

    /* for each coordinate in the implicant ... */
    while ((X = next_coord(coord,I,init)) != NULL) {
        init = 0;
        equal = 0;
        vcopy(value,eval(&E_work,X));
        if (value[EVAL] == E_orig.radix)
            continue;
        diff = value[EVAL] - I_value;
        first = 1;

        /* for each direction ... */
        for (i=0; i < E_orig.nvar; i++) {
            good = 0;
            bad = 0;
            if ((diff <= 0) && first) {
                good = 2;
                first = 0;
            }

            /* if there is a left neighbor, examine it */
            if (X[i] != 0 && X[i] == I->B[i].lower) {
                X[i]--;
                vcopy(neighbor_value,eval(&E_work,X));
                neig_boun = neighbor_value[EVAL] - value[EVAL];
                X[i]++;
                if (neighbor_value[EVAL] != 0) {
                    if (!neighbor_value[HLV] || !value[HLV]) {
                        if (neighbor_value[EVAL] < diff) {
                            if (neighbor_value[HLV])
                                good += 1;
                            else
                                bad += 2;
                        }
                        if (neighbor_value[EVAL] > diff) {
                            if (!neig_boun)
                                bad += 2;
                            if (neighbor_value[HLV] && neig_boun < 0)
                                bad += 2;
                            if (diff > 0 && neig_boun) {
                                if (value[HLV])
                                    good += 1;
                                else
                                    bad += 2;
                            }
                        }
                        else {
                            if (neighbor_value[HLV] || value[HLV])
                                good += 1;
                            else
                                good += 2;
                        }
                    }
                }
            }

            /* if there is a right neighbor, examine it */
            if (X[i] != R_1 && X[i] == I->B[i].upper) {
                X[i]++;
                vcopy(neighbor_value,eval(&E_work,X));
                neig_boun = neighbor_value[EVAL] - value[EVAL];
                X[i]--;
                if (neighbor_value[EVAL] != 0) {
                    if (!neighbor_value[HLV] || !value[HLV]) {
                        if (neighbor_value[EVAL] < diff) {
                            if (neighbor_value[HLV])
                                good += 1;
                            else
                                bad += 2;
                        }
                        if (neighbor_value[EVAL] > diff) {
                            if (!neig_boun)
                                bad += 2;
                            if (neighbor_value[HLV] && neig_boun < 0)
                                bad += 2;
                            if (diff > 0 && neig_boun) {
                                if (value[HLV])
                                    good += 1;
                                else
                                    bad += 2;
                            }
                        }
                        else {
                            if (neighbor_value[HLV] || value[HLV])
                                good += 1;
                            else
                                good += 2;
                        }
                    }
                }
            }

            /* update the rbc */
            rbc = (rbc - good) + bad;
        }
    }
    return(rbc);
}

static Implicant *pick_implicant(X)
int *X;
/* ----------------------------------------------------------------------
    :function:
        - Pick the best implicant for minterm X
    :globals:
        radix
    :side effects:
        STAT
    :called by:
        Wang_Yang()
    :calls:
        init_implicant()
        gen_bounds()
        next_implicant()
        eval()
        vcopy()
        compute_rbc()
        copy_implicant()
        valid_implicant()
    :returns:
        - A pointer to a term representing the best implicant.
---------------------------------------------------------------------- */
{
    int cur_rbc = MAX_INT,
        rbc = 0,
        I_value,
        i,
        init = 1,
        first = 1;
    Implicant *I;
    static int
        coord[MAX_VAR+2];
    static Bound I_bound[MAX_VAR+2];
    static Implicant I_best;
    Bound *B;
    int V[2],
        value[2];

    I_best.B = I_bound;
    init_implicant(X);
    B = gen_bounds(X);
    vcopy(V,eval(&E_orig,X));

    while ((I = next_implicant(B)) != NULL) {
        if (V[HLV]) {
            /* saturated minterm: try every coefficient up to radix-1 */
            for (I->coeff=X[E_orig.nvar]; I->coeff < E_orig.radix; (I->coeff)++) {
                if (valid_implicant(I)) {
                    rbc = compute_rbc(I);
                    if (first)
                        rbc = 2;
                    else
                        rbc += 2;
                    if (rbc <= cur_rbc) {
                        cur_rbc = rbc;
                        I->rbc = rbc;
                        copy_implicant(&I_best,I);
                    }
                    first = 0;
                }
            }
        }
        else {
            I->coeff = X[E_orig.nvar];
            if (valid_implicant(I)) {
                rbc = compute_rbc(I);
                if (first) {
                    first = 0;
                    if (rbc < 0)
                        rbc = 1;
                    else
                        rbc += 2;
                }
                else
                    rbc += 2;
                if (rbc <= cur_rbc) {
                    cur_rbc = rbc;
                    I->rbc = rbc;
                    copy_implicant(&I_best,I);
                }
            }
        }
    }
    return(&I_best);
}


int *eval(E,X)
Expression *E;
int *X;
/* ----------------------------------------------------------------------
    :function:
        - Evaluate the expression at X, where X is a vector of coordinates
    :globals:
        nvar
        radix
    :side effects:
        STAT
    :called by:
        mim() - pa.c
        valid_implicant() - pa.c
        pick_implicant() - pa.c
        mim() - dm.c
        valid_implicant() - dm.c
        pick_implicant() - dm.c
        cf()
        compute_rbc()
        gen_bounds()
        print_map()
    :returns:
        - A vector with the value of the expression at the specified
          coordinate as its first element, and a flag set if this value
          has attained the highest logic value (HLV)
---------------------------------------------------------------------- */
{
    int nterm = E->nterm;
    register i,j,k;
    int out_of_bounds;
    static int V[2];
    register rm1 = E_orig.radix-1;

    V[EVAL] = 0;
    V[HLV] = 0;
    /* for each term ... */
    for (i=0; i < nterm; i++) {
        /* for each variable ... */
        for (j=0,out_of_bounds=0; j < E_orig.nvar; j++) {
            if (
                (X[j] < E->I[i].B[j].lower) ||
                (X[j] > E->I[i].B[j].upper)
               ) {
                out_of_bounds = 1;
                break;
            }
        }
        if (out_of_bounds)
            continue;

        /* if this is a don't care, return the radix */
        if (E->I[i].coeff == E_orig.radix) {
            V[EVAL] = E_orig.radix;
            return(V);
        }

        V[EVAL] += E->I[i].coeff;
        if (V[EVAL] >= rm1) {
            /* set a flag which means E_orig was saturated at this X */
            V[HLV] = 1;
        }

        if (V[EVAL] > rm1) {
            V[EVAL] = rm1;
        }
        else if (V[HLV] && (V[EVAL] <= 0)) {
            V[EVAL] = E_orig.radix;
            return(V);
        }
    }
    return(V);
}
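
The behaviour described in the eval() header (truncated sum, don't cares, and the HLV flag) can be tried out on a tiny expression. The sketch below is illustrative only and makes several assumptions: `struct bound`, `struct term`, `eval_sketch` and `NV` are hypothetical names, the result is written into a caller-supplied vector instead of a static one, and truncation and the don't-care test are assumed to be applied as each covering term is accumulated. A subtracted implicant is modelled, as in subtract_implicant(), by a term with a negated coefficient.

```c
#include <assert.h>

#define NV 2                      /* assumed number of variables */

struct bound { int lower, upper; };
struct term  { struct bound b[NV]; int coeff; };

enum { EVAL = 0, HLV = 1 };       /* value first, HLV flag second */

/* Evaluate the first nterm terms of t at coordinate x for the given
 * radix.  v[EVAL] receives the value (radix means "don't care") and
 * v[HLV] is set once the running sum has reached radix-1. */
static void eval_sketch(const struct term *t, int nterm, int radix,
                        const int x[NV], int v[2])
{
    int i, j, inside;
    int rm1 = radix - 1;

    v[EVAL] = 0;
    v[HLV] = 0;
    for (i = 0; i < nterm; i++) {
        inside = 1;
        for (j = 0; j < NV; j++) {
            if (x[j] < t[i].b[j].lower || x[j] > t[i].b[j].upper) {
                inside = 0;
                break;
            }
        }
        if (!inside)
            continue;
        if (t[i].coeff == radix) {        /* explicit don't care term */
            v[EVAL] = radix;
            return;
        }
        v[EVAL] += t[i].coeff;
        if (v[EVAL] >= rm1)
            v[HLV] = 1;                   /* saturated at this point */
        if (v[EVAL] > rm1)
            v[EVAL] = rm1;                /* truncate the sum */
        else if (v[HLV] && v[EVAL] <= 0) {
            v[EVAL] = radix;              /* saturated cell fully covered */
            return;
        }
    }
}
```

Under these assumptions, a cell that was saturated in the original expression turns into a don't care as soon as a subtracted implicant brings its value down to zero, while an unsaturated cell simply evaluates to its remaining value.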


int *next_coord(coord,I,first)
int *coord;
Implicant *I;
int first;
/* ----------------------------------------------------------------------
    :function:
        - Compute the next possible coordinate for term *I
        - If first == 1, initialize the coord vector
    :called by:
        mim()
        valid_implicant()
        compute_rbc()
    :returns:
        - An integer vector containing the coordinates.
---------------------------------------------------------------------- */
{
    static i;

    /* if the first time through, load the vector */
    if (first) {
        for (i=0; i < E_orig.nvar; i++) {
            coord[i] = I->B[i].lower;
        }
    }
    else {
        i = 0;
        coord[i]++;
        for (;;) {
            if (coord[i] > I->B[i].upper) {
                /* wrap this variable and carry into the next one */
                coord[i] = I->B[i].lower;
                i++;
                if (i >= E_orig.nvar)
                    return(NULL);
                coord[i]++;
            }
            else {
                break;
            }
        }
    }
    return(coord);
}


Bound *gen_bounds(X)
int *X;
/* ----------------------------------------------------------------------
    :function:
        - Generate the permissible bounds around location X in the
          working expression
    :globals:
        radix
        nvar
        E_work
        E_orig
    :side effects:
        STAT
    :called by:
        pick_implicant()
    :calls:
        eval()
        vcopy()
    :returns:
        - A bounds array
---------------------------------------------------------------------- */
{
    static Bound B[MAX_VAR+2];
    int nterm = E_work.nterm;
    register i,j,k;
    int value, Vw[2], Vo[2];
    int Xp[MAX_VAR+2];

    value = X[E_orig.nvar];

    /* for each variable (direction)... */
    for (i=0; i < E_orig.nvar; i++) {
        /* dup the coordinate */
        for (j=0; j < E_orig.nvar; j++) Xp[j] = X[j];
        B[i].lower = X[i];
        /* while not on a left hand edge, move left */
        while (Xp[i] > 0) {
            Xp[i]--;
            vcopy(Vw,eval(&E_work,Xp));
            vcopy(Vo,eval(&E_orig,Xp));
            /* if can't expand to left .... */
            if (!((value > Vw[EVAL]) && (Vo[EVAL] < (E_orig.radix-1)))) {
                B[i].lower = Xp[i];
            }
            else
                break;
        }

        /* dup the coordinate */
        for (j=0; j <= (E_orig.nvar+1); j++) Xp[j] = X[j];
        B[i].upper = X[i];
        /* while not on a right hand edge, move right */
        while (Xp[i] < (E_orig.radix-1)) {
            Xp[i]++;
            vcopy(Vw,eval(&E_work,Xp));
            vcopy(Vo,eval(&E_orig,Xp));
            /* if can't expand to right ... */
            if (!((value > Vw[EVAL]) && (Vo[EVAL] < (E_orig.radix-1)))) {
                B[i].upper = Xp[i];
            }
            else
                break;
        }
    }

    return(B);
}


/* Working structures for picking the next implicant within bounds */

static Bound IB[MAX_VAR+2];     /* Current bounds */
static Implicant I;             /* Implicant */
static int
    I_var,
    I_first,
    I_val;
int X_orig[MAX_VAR+2];          /* Where we start */

init_implicant(X)
int *X;
/* ----------------------------------------------------------------------
    :function:
        - Initialize the static term structure above from which successive
          implicants will be returned
        - X is the starting minterm
    :side effects:
        - The structures above
    :called by:
        pick_implicant()
---------------------------------------------------------------------- */
{
    int nterm = E_work.nterm;
    register i;

    /* initialize the implicant */
    I.B = IB;
    I.coeff = X[E_orig.nvar];
    I.rbc = X[E_orig.nvar+1];
    for (i=0; i < E_orig.nvar; i++) {
        I.B[i].upper = X[i];
        I.B[i].lower = X[i];
    }
    I_var = 0;
    I_first = 1;

    I_val = X[E_orig.nvar];
    for (i=0; i <= (E_orig.nvar+1); i++) X_orig[i] = X[i];
}

Implicant *next_implicant(B)
Bound *B;
/* ----------------------------------------------------------------------
    :function:
        - On each call, return the next implicant within bounds B
    :side effects:
        STAT
    :called by:
        pick_implicant()
    :returns:
        - An implicant as a term structure
---------------------------------------------------------------------- */
{
    int nterm = E_work.nterm;
    int Xp[MAX_VAR+2];

    if (I_first) {
        I_first = 0;
        return(&I);
    }

    while (I_var < E_orig.nvar) {

        /* expand left */
        I.B[I_var].lower--;

        /* if we can't go further left, then ... */
        if (I.B[I_var].lower < B[I_var].lower) {

            /* move back and go right */
            I.B[I_var].lower = X_orig[I_var];
            I.B[I_var].upper++;

            /* if we can't go further right, then ... */
            if (I.B[I_var].upper > B[I_var].upper) {

                /* reset and go to the next higher dimension */
                I.B[I_var].upper = X_orig[I_var];
                I_var++;
                continue;
            }
        }
        return(&I);
    }
    return(NULL);
}


int copy_implicant(dest,src)
Implicant *dest,*src;
/* ----------------------------------------------------------------------
    :function:
        - Copy the implicant pointed to by src to dest
    :called by:
        pick_implicant()
---------------------------------------------------------------------- */
{
    register i;

    dest->coeff = src->coeff;
    dest->rbc = src->rbc;
    for (i=0; i < E_orig.nvar; i++) {
        dest->B[i].lower = src->B[i].lower;
        dest->B[i].upper = src->B[i].upper;
    }
}

subtract_implicant(I)
Implicant *I;
/*---------------------------------------------------------------------------
:function:
    - Add implicant I to the working expression as a negative term
      (negated coefficient)
    - Add implicant I to the final expression
:globals:
    HEUR
    nvar
:side effects:
    E_work
    E_final[]
---------------------------------------------------------------------------*/
{
    register i,term;

    term = E_work.nterm;
    E_work.nterm++;
    E_work.I = alloc_implicant(E_work.I,-(I->coeff),E_work.nterm);
    for (i=0; i < E_orig.nvar; i++) {
        E_work.I[term].B[i].lower = I->B[i].lower;
        E_work.I[term].B[i].upper = I->B[i].upper;
    }

    term = E_final[HEUR].nterm;
    E_final[HEUR].nterm++;
    E_final[HEUR].I =
        alloc_implicant(E_final[HEUR].I,I->coeff,E_final[HEUR].nterm);
    for (i=0; i < E_orig.nvar; i++) {
        E_final[HEUR].I[term].B[i].lower = I->B[i].lower;
        E_final[HEUR].I[term].B[i].upper = I->B[i].upper;
    }
}

/* vcopy()
    - copies the value vector from s to d
*/
vcopy(d,s)
int *d,*s;
{
    d[0] = s[0];
    d[1] = s[1];
}

/* memory allocation functions --------------------------------------------- */


Implicant *alloc_implicant(p,coeff,n)
Implicant *p;
int coeff,n;
/*---------------------------------------------------------------------------
:function:
    - Allocate space for a term array, initializing the last element
    - If p is NULL, allocate new space
    - If p is not, realloc
:returns:
    - A pointer to the Implicant
---------------------------------------------------------------------------*/
{
    char *malloc(),*realloc();
    Bound *alloc_bound();

    if (p == NULL) {
        if ((p=(Implicant *)malloc(sizeof(Implicant)*n)) == NULL)
            fatal("alloc_implicant(): out of memory\n");
        p->coeff = coeff;
        p->B = alloc_bound();
    }
    else {
        if ((p=(Implicant *)realloc(p,sizeof(Implicant)*n)) == NULL)
            fatal("alloc_implicant(): out of memory\n");
        p[n-1].coeff = coeff;
        p[n-1].B = alloc_bound();
    }
    return(p);
}



Bound *alloc_bound()
/*---------------------------------------------------------------------------
:function:
    - Allocate space for E_orig.nvar bounds entries and initialize
      each bound to -1,E_orig.radix-1.
    - If p is NULL, allocate new space
:globals:
    E_orig
:returns:
    - A pointer to the Bound array
---------------------------------------------------------------------------*/
{
    Bound *p;
    char *malloc();
    register i;

    if ((p=(Bound *)malloc(sizeof(Bound)*(E_orig.nvar))) == NULL)
        fatal("alloc_bound(): out of memory\n");
    for (i=0; i < E_orig.nvar; i++) {
        p[i].lower = -1;
        p[i].upper = E_orig.radix-1;
    }
    return(p);
}
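
The grow-by-one allocation pattern of alloc_implicant() and alloc_bound() can be sketched in ANSI C as follows. This is an illustration under stated assumptions rather than the thesis code: the Term type and the functions term_append() and term_free() are hypothetical names, and nvar and radix are passed as parameters instead of being read from the global E_orig. The sketch keeps the default bounds described above (-1 and radix-1) and, unlike the listing, assigns the result of realloc() to a temporary so that the old array is not lost when allocation fails.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int lower, upper; } Bound;
typedef struct { int coeff; Bound *B; } Term;   /* hypothetical stand-in for Implicant */

/* Grow the array p to n terms and initialize term n-1.
   Returns the (possibly moved) array, or NULL on failure,
   in which case p is still valid and unchanged. */
Term *term_append(Term *p, int n, int coeff, int nvar, int radix)
{
    Term  *q;
    Bound *b;
    int    i;

    assert(n > 0 && nvar > 0);

    /* allocate and initialize the bounds of the new term */
    b = malloc(sizeof *b * (size_t)nvar);
    if (b == NULL)
        return NULL;
    for (i = 0; i < nvar; i++) {
        b[i].lower = -1;
        b[i].upper = radix - 1;
    }

    /* realloc(NULL, size) behaves like malloc(size) */
    q = realloc(p, sizeof *q * (size_t)n);
    if (q == NULL) {
        free(b);
        return NULL;
    }
    q[n - 1].coeff = coeff;
    q[n - 1].B = b;
    return q;
}

/* Release n terms and their bounds arrays. */
void term_free(Term *p, int n)
{
    int i;
    for (i = 0; i < n; i++)
        free(p[i].B);
    free(p);
}
```

Since realloc() accepts a NULL pointer, one code path serves both the first allocation and every later extension.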
init_expr()
/*---------------------------------------------------------------------------
:function:
    - Initialize E_work, E_orig and E_final
:side effects:
    E_work
    E_orig
    E_final
---------------------------------------------------------------------------*/
{
    E_work.I = NULL;
    E_orig.I = NULL;
    E_orig.nvar = 0;
    E_orig.nterm = 0;
    E_orig.radix = 0;
    E_final[0].I = NULL;
    E_final[1].I = NULL;
}



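dealloc_expr(e)
Expression *e;
/*---------------------------------------------------------------------------
:function:
    - Deallocate the expression pointed to by e
---------------------------------------------------------------------------*/
{
    Implicant *p;
    register i;

    if (e->I != NULL) {
        for (p = e->I, i=0; i < e->nterm; i++)
            if (p[i].B != NULL) {
                free(p[i].B);
                p[i].B = NULL;
            }
        free(p);
        e->I = NULL;
    }
    e->nvar = 0;
    e->nterm = 0;
    e->radix = 0;
}
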

dup_expr(E_dest,E_src)
Expression *E_dest;
Expression *E_src;
/*---------------------------------------------------------------------------
:function:
    - Duplicate the expression pointed to by E_src by allocating as
      necessary and copying into the expression pointed to by E_dest.
    - If E_dest can contain E_src, no reallocation is performed (this
      test is made by comparing nvar and nterm parameters, and by testing
      pointers against NULL)
:calls:
    alloc_bound()
---------------------------------------------------------------------------*/
{
    Implicant *I;
    Bound *B;
    register i,j;
    char *malloc();
    int
        nterm = E_dest->nterm,
        nvar = E_dest->nvar;

    if (nterm != E_src->nterm) {
        if (E_dest->I != NULL)
            dealloc_expr(E_dest);
    }

    E_dest->radix = E_src->radix;
    E_dest->nvar = E_src->nvar;
    E_dest->nterm = E_src->nterm;

    if (E_dest->I == NULL) {
        if ((I=(Implicant *)malloc(sizeof(Implicant)*(E_dest->nterm))) ==
                NULL)
            fatal("dup_expr(): out of memory\n");
        for (i=0; i < E_src->nterm; i++)
            I[i].B = NULL;
        E_dest->I = I;
    }
    else
        I = E_dest->I;

    for (i=0; i < E_src->nterm; i++) {
        I[i].coeff = E_src->I[i].coeff;
        if ((E_orig.nvar != E_src->nvar) || (I[i].B == NULL)) {
            I[i].B = alloc_bound();
        }
        for (j=0; j < E_src->nvar; j++) {
            I[i].B[j].lower = E_src->I[i].B[j].lower;
            I[i].B[j].upper = E_src->I[i].B[j].upper;
        }
    }
}

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static struct tms 
T1,T2,T1a,T2a; 

resource_used(op)
{
    static call = 0;

    if (op == START)
        times(call==0?&T1:&T1a);
    else
        times(call==1?&T2:&T2a);

    if (++call > 1)
        call = 0;
}


#ifndef HZ 
#define HZ 60 
#endif 


long secs_used()
{
    return((T2.tms_utime-T1.tms_utime)/HZ);
}

long tsecs_used()
{
    return((((T2.tms_utime-T1.tms_utime)%HZ)*1000l)/HZ);
}
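
The two timing routines split an elapsed tick count into whole seconds and a millisecond remainder. A minimal sketch of that arithmetic is given below; the function names ticks_to_secs() and ticks_to_msecs() are assumptions introduced for illustration, and the tick rate is passed as a parameter instead of being taken from the HZ macro.

```c
#include <assert.h>

/* Whole seconds contained in a tick count at hz ticks per second. */
long ticks_to_secs(long ticks, long hz)
{
    assert(hz > 0);
    return ticks / hz;
}

/* Leftover fraction of a second, expressed in milliseconds. */
long ticks_to_msecs(long ticks, long hz)
{
    assert(hz > 0);
    return ((ticks % hz) * 1000L) / hz;
}
```

With hz = 60, for example, 90 ticks correspond to 1 second and 500 milliseconds. The multiplication by 1000 is done before the division so that integer arithmetic does not truncate the remainder to zero.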


fatal(s)
char *s;
{
    fprintf(stderr,"%s\n",s);
    exit(1);
}

print_map()
{
    register i,j;
    int X[MAX_VAR+2];
    int *V;

    for (i=0; i < E_orig.nvar; i++) X[i] = 0;
    for (i=0; i < E_orig.nvar;) {
        V = eval(&E_work,X);
        sprintf(msg,"%s%3d%c",X[i]==0?" ":"",V[EVAL],V[HLV]?'.':' ');
        cwrite(fd,msg,strlen(msg));
        X[i]++;
        for (;i < E_orig.nvar;) {
            if (X[i] >= E_orig.radix) {
                X[i] = 0;
                if (i < 2) {
                    sprintf(msg,"\n");
                    cwrite(fd,msg,strlen(msg));
                }
                i++;
                X[i]++;
            }
            else {
                i = 0;
                break;
            }
        }
    }
}

print_implicant(X,I)
int *X;
Implicant *I;
/*---------------------------------------------------------------------------
:function:
    - Print the Most Isolated Minterm X and the implicant selected
      to cover it I.
:called by:
    main()
---------------------------------------------------------------------------*/
{
    register i;

    if (X != NULL) {
        sprintf(msg," MIM: (%d) %2d",X[E_orig.nvar+1],X[E_orig.nvar]);
        cwrite(fd,msg,strlen(msg));
        for (i=0; i < E_orig.nvar; i++) {
            sprintf(msg,"*X%d(%2d)",i+1,X[i]);
            cwrite(fd,msg,strlen(msg));
        }
        sprintf(msg,"\n");
        cwrite(fd,msg,strlen(msg));
    }
    sprintf(msg," Imp: (%d) %2d",I->rbc,I->coeff);
    cwrite(fd,msg,strlen(msg));
    for (i=0; i < E_orig.nvar; i++) {
        sprintf(msg,"*X%d(%2d,%2d)",i+1,I->B[i].lower,I->B[i].upper);
        cwrite(fd,msg,strlen(msg));
    }
    sprintf(msg,"\n\n");
    cwrite(fd,msg,strlen(msg));
}

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APPENDIX C: TIME COMPARISON TABLES 


TABLE C.1: TWO VARIABLE FOUR VALUED TIME COMPARISON 


Number of      Computation Time     Computation Time
Input Terms    for Sequential       for Parallel
               Algorithm (secs.)    Algorithm (secs.)


| 5 — | 095 — | 0230 | 1300 — 
0.6420 
1.8235 

: 


а | 57e | 200 — | 270 | 
№ | ом тоже. 


6.7473 2.4200 2.7881 
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TABLE C.2: THREE VARIABLE FOUR VALUED TIME COMPARISON 

























Number of      Computation Time     Computation Time     Ratio
Input Terms    for Sequential       for Parallel
               Algorithm (secs.)    Algorithm (secs.)

                                                         1.4222
                                                         1.5631
                                                         1.5367
                                                         1.5641
                                                         1.6715
                                                         1.6655
                                                         1.6682
                                                         1.6518
                                                         1.6717
                                                         1.7025
                                                         1.7340
                                                         1.8020
               78.4003              41.2967              1.8985
               79.2020              40.8500              1.9388
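
The Ratio column appears to be the sequential time divided by the parallel time; the two rows that survive in full (78.4003 / 41.2967 and 79.2020 / 40.8500) are consistent with that reading. A small sketch of the calculation follows. The function names speedup() and close_to() are hypothetical helpers introduced here, not part of the thesis program.

```c
#include <assert.h>

/* Ratio of sequential to parallel computation time. */
double speedup(double t_seq, double t_par)
{
    assert(t_par > 0.0);
    return t_seq / t_par;
}

/* Nonzero when a and b differ by at most tol. */
int close_to(double a, double b, double tol)
{
    double d = a - b;
    if (d < 0.0)
        d = -d;
    return d <= tol;
}
```

For instance, speedup(78.4003, 41.2967) agrees with the tabulated 1.8985 to within rounding of the fourth decimal place.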


TABLE C.3: FOUR VARIABLE FOUR VALUED TIME COMPARISON


Number of      Computation Time     Computation Time
Input Terms    for Sequential       for Parallel
               Algorithm (secs.)    Algorithm (secs.)


у 


311.2900 —— 


6.3707 4.8860 


400.2017 234.6913 


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APPENDIX D: SOLUTION SETS FOR EXAMPLE 6 
SOLUTION FROM ND ALGORITHM


Orig map (W&Y):
1 1 1 1 
SS S. 3. 
I? 3. 2 
M 3. 2 


MIM: (4) 2*X1( 3)*X2( 2) 
Imp: (-9) 2*X1( 1, 3)*X2( 1, 3) 


[map illegible in source]


MIM: (4) 1*X1( 0)*X2( 2) 
Imp: (-2) 1*X1( 0, 0)*X2( 0, 3) 


[map illegible in source]


MIM: (6) 1*X1( 2)*X2( 3) 
Imp: (-2) 1*X1( 2, 2)*X2( 0, 3)


[map illegible in source]


MIM: (4) 1*X1( 1)*X2( 0)
Imp: (-2) 1*X1( 1, 1)*X2( 0, 1) 


[map illegible in source]
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MIM: (4) 1*X1( 3)*X2( 0) 
Imp: (-2) 1*X1( 3, 3)*X2( 0, 1) 


[map illegible in source]


MIM: (6) 2*X1( 0)*X2( 1) 
Imp: (0) 3*X1( 0, 3)*X2( 1, 1) 


[map illegible in source]


1 W&Y: 6/10 0.60 0:640 
SOLUTION FROM MCND NODE #0 AND #1


MIM: (4) 2*X1( 3)*X2( 2) 
Imp: (-9) 2*X1( 1, 3)*X2( 1, 3) 


[map illegible in source]


MIM: (4) 1*X1( 0)*X2( 2) 
Imp: (-2) 1*X1( 0, 0)*X2( 0, 3) 


[map illegible in source]


MIM: (6) 1*X1( 2)*X2( 3) 
Imp: (-2) 1*X1( 2, 2)*X2( 0, 3) 


0  1  0  1
2. 1. 4. 1
0  0  4. 0
0  0  4. 0


MIM: (4) 1*X1( 3)*X2( 0) 
Imp: (-2) 1*X1( 3, 3)*X2( 0, 1) 


0100 
2. 1. 4. 4 
0.0 4.0 
0.0 4.0 


MIM: (4) 1*X1( 1)*X2( 0) 
Imp: (-2) 1*X1( 1, 1)*X2( 0, 1) 
[map illegible in source]


MIM: (6) 2*X1( 0)*X2( 1) 
Imp: (0) 3*X1( 0, 3)*X2( 1, 1) 


[map illegible in source]


1 OPT_PAR: 6/10 0.60 11:915 From node: 0,1
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SOLUTION FROM MCND NODE #2 AND #3 


Orig map (OPT_ND):


[map illegible in source]


MIM: (4) 2*X1( 3)*X2( 2) 
Imp: (-9) 2*X1( 1, 3)*X2( 1, 3) 


[map illegible in source]


MIM: (4) 1*X1( 0)*X2( 3) 
Imp: (-2) 1*X1( 0, 0)*X2( 0, 3) 


[map illegible in source]


MIM: (6) 2*X1( 0)*X2( 1)
Imp: (0) 3*X1( 0, 3)*X2( 1, 1) 


[map illegible in source]


MIM: (5) 1*X1( 3)*X2( 0) 
Imp: (-4) 1*X1( 1, 3)*X2( 0, 1) 


[map illegible in source]

MIM: (5) 1*X1( 2)*X2( 3) 
Imp: (-2) 3*X1( 2, 2)*X2( 1, 3) 
[map illegible in source]


1 OPT_PAR: 5/10 0.50 11:241 From node: 2,3
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SOLUTION FROM MCND NODE #4 THROUGH #7 


MIM: (4) 1*X1( 0)*X2( 3) 
Imp: (-10) 1*X1( 0, 3)*X2( 0, 3) 


[map illegible in source]


MIM: (4) 1*X1( 1)*X2( 2) 
Imp: (-6) 1*X1( 1, 3)*X2( 1, 3) 


[map illegible in source]


MIM: (5) 1*X1( 2)*X2( 3) 
Imp: (-4) 3*X1( 2, 2)*X2( 1, 3) 


[map illegible in source]


MIM: (6) 1*X1( 3)*X2( 1) 
Imp: (-4) 3*X1( 0, 3)*X2( 1, 1) 


[map illegible in source]


1 OPT_PAR: 4/10 0.40 10:014 From node: 4,5,6,7
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